djsutherland / pummeler

Utilities to analyze ACS PUMS files, especially for distribution regression / ecological inference
MIT License

new subsets feature crashes when zero or one subsets used #19

Closed flaxter closed 7 years ago

flaxter commented 7 years ago

No subsets:

ubuntu@ip-172-31-47-233:~/pummeler$ ./pummel featurize regions
Picking bandwidth by median heuristic...picked 10.0905135193
  0% (      0 of 9222637) |                                                                                              | Elapsed Time: 0:00:00 ETA:  --:--:--
Traceback (most recent call last):
  File "./pummel", line 5, in <module>
    main()
  File "/home/ubuntu/pummeler/pummeler/cli.py", line 116, in main
    args.func(args, parser)
  File "/home/ubuntu/pummeler/pummeler/cli.py", line 148, in do_featurize
    subsets=args.subsets)
  File "/home/ubuntu/pummeler/pummeler/featurize.py", line 206, in get_embeddings
    c = c.loc[keep]
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1311, in __getitem__
    return self._getitem_axis(key, axis=0)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1481, in _getitem_axis
    self._has_valid_type(key, axis)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1418, in _has_valid_type
    error()
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1405, in error
    (key, self.obj._get_axis_name(axis)))
KeyError: 'the label [True] is not in the [index]'
Closing remaining open files:regions/feats_SC_10_06.h5...done

One subset:

ubuntu@ip-172-31-47-233:~/pummeler$ ./pummel featurize --subsets "SEX == 1" regions 
Picking bandwidth by median heuristic...picked 10.0905135193
  0% (      0 of 9222637) |                                                                                              | Elapsed Time: 0:00:00 ETA:  --:--:--
Traceback (most recent call last):
  File "./pummel", line 5, in <module>
    main()
  File "/home/ubuntu/pummeler/pummeler/cli.py", line 116, in main
    args.func(args, parser)
  File "/home/ubuntu/pummeler/pummeler/cli.py", line 148, in do_featurize
    subsets=args.subsets)
  File "/home/ubuntu/pummeler/pummeler/featurize.py", line 206, in get_embeddings
    c = c.loc[keep]
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1311, in __getitem__
    return self._getitem_axis(key, axis=0)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1481, in _getitem_axis
    self._has_valid_type(key, axis)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1418, in _has_valid_type
    error()
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/indexing.py", line 1405, in error
    (key, self.obj._get_axis_name(axis)))
KeyError: 'the label [True] is not in the [index]'
Closing remaining open files:regions/feats_SC_10_06.h5...done
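
For anyone reading the tracebacks: the message `the label [True] is not in the [index]` suggests that `keep` reached `c.loc[keep]` as a single boolean rather than a per-row mask, so pandas looked it up as a row label. The code in `featurize.py` is not shown here, so the following is only a minimal sketch of that failure mode and one possible guard. The helper `as_row_mask`, the toy frame, and its string index are hypothetical and not taken from pummeler.

```python
import numpy as np
import pandas as pd


def as_row_mask(keep, index):
    """Hypothetical helper: coerce `keep` to a boolean array with one entry per row.

    A scalar (0-d) value is broadcast to every row; an array-like is
    converted to a boolean ndarray as-is.
    """
    mask = np.asarray(keep, dtype=bool)
    if mask.ndim == 0:
        mask = np.full(len(index), bool(mask))
    return mask


# Toy stand-in for one chunk of person records (not real PUMS data).
c = pd.DataFrame({"SEX": [1, 2, 1]}, index=["a", "b", "c"])

# A bare True is interpreted as a label lookup, which fails with KeyError.
try:
    c.loc[True]
    scalar_failed = False
except KeyError:
    scalar_failed = True

# After coercion, the same value selects every row.
filtered = c.loc[as_row_mask(True, c.index)]

# An ordinary per-row mask passes through unchanged.
males = c.loc[as_row_mask(c["SEX"] == 1, c.index)]
```

If the zero-subset and one-subset cases do collapse `keep` to a scalar somewhere upstream, coercing it just before the `.loc` call would avoid the crash, though fixing the place where the mask is built is probably the cleaner change.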