gmr / flatdict

Python module for interacting with nested dicts as a single level dict with delimited keys.
https://flatdict.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Version 3 breaks Pandas! #17

Closed: jgostick closed this issue 6 years ago

jgostick commented 6 years ago

I'm not sure whether this is a Pandas problem or a flatdict one, but passing a FlatDict object to the Pandas DataFrame constructor started breaking my Travis builds last night. (It took me hours to figure out that you had released a new version last night :-( )

Anyway, here is a simple example:

>>> import flatdict as fd
>>> import pandas as pd
>>> a = {'top': {'middle': {'data1': 1, 'bottom': {'data2': 3}}}}
>>> b = fd.FlatDict(a)
>>> df = pd.DataFrame(b)
ValueError: DataFrame constructor not properly called!

Also, the change to as_dict means I can no longer get a normal dict with the flat keys to pass to Pandas. Maybe this should be an argument to the as_dict method?

gmr commented 6 years ago

That probably needs to be either pd.DataFrame(b.iteritems()) or pd.DataFrame(b.as_dict()), depending on what you're looking for.

iteritems will give you the flat keys; as_dict will give you a nested dict structure.
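
To make the two shapes concrete, here is a minimal pure-Python sketch. It is not flatdict's implementation: the helper names `flatten` and `unflatten` are hypothetical, and the `:` delimiter is an assumption based on the keys that appear elsewhere in this thread.

```python
def flatten(nested, delimiter=':', prefix=''):
    """Collapse a nested dict into a single level with delimited keys."""
    flat = {}
    for key, value in nested.items():
        full_key = f'{prefix}{delimiter}{key}' if prefix else str(key)
        if isinstance(value, dict) and value:
            # Recurse into non-empty child dicts, carrying the key path along.
            flat.update(flatten(value, delimiter, full_key))
        else:
            flat[full_key] = value
    return flat


def unflatten(flat, delimiter=':'):
    """Rebuild the nested structure from delimited keys."""
    nested = {}
    for key, value in flat.items():
        *parents, leaf = key.split(delimiter)
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


a = {'top': {'middle': {'data1': 1, 'bottom': {'data2': 3}}}}
flat = flatten(a)            # flat keys, roughly what iterating items yields
flat_pairs = list(flat.items())
nested_again = unflatten(flat)  # nested shape, comparable to as_dict
```

The list of `(key, value)` pairs is the form that a tabular consumer can usually read as rows, while the nested form needs extra handling by the caller.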

3.0 changed the base class from dict to collections.MutableMapping, which is probably what Pandas does not like, though it should be usable as an iterator and can even be cast to a dict now: pd.DataFrame(dict(b))

jgostick commented 6 years ago

Casting to dict would be nice, but I just checked and it does not work: I got ValueError: If using all scalar values, you must pass an index, which might be due to my simple example above. iteritems does work, so I'll use that. (as_dict also no longer works, since Pandas doesn't quite know how to handle nested data without a bit of help from the user.)

I'm not sure why you moved to MutableMapping, but IMHO having it behave more like a normal dict was an asset... so a dict acts like a dict.

gmr commented 6 years ago

Having it extend dict was making it misbehave in other use cases. collections.MutableMapping is supposed to be the proper way to implement dict-like classes in user code, at least from what I've read: https://docs.python.org/3/glossary.html#term-mapping

Ultimately the change allowed me to cleanly address #15 and #16. Changing it back breaks the test cases and the ability to pickle/unpickle the instance.
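
As an illustration of how a dict subclass can misbehave on a cast, here is a small sketch under stated assumptions. The classes `DictBacked` and `MappingBacked` are hypothetical and are not taken from flatdict's source. On CPython, `dict()` can copy a dict subclass's underlying storage directly when the subclass does not override `__iter__`, so any data kept in a separate attribute is silently skipped. A `MutableMapping` subclass has no such storage, so `dict()` has to go through the methods the class defines.

```python
from collections.abc import MutableMapping


class DictBacked(dict):
    """Hypothetical dict subclass that keeps its data in a private attribute."""

    def __init__(self):
        super().__init__()
        self._values = {}

    def __setitem__(self, key, value):
        self._values[key] = value

    def __getitem__(self, key):
        return self._values[key]


class MappingBacked(MutableMapping):
    """Hypothetical equivalent built on the abstract base class."""

    def __init__(self):
        self._values = {}

    def __getitem__(self, key):
        return self._values[key]

    def __setitem__(self, key, value):
        self._values[key] = value

    def __delitem__(self, key):
        del self._values[key]

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)


a = DictBacked()
a['foo:bar'] = 1
b = MappingBacked()
b['foo:bar'] = 1

dict_from_subclass = dict(a)  # reads the (empty) built-in storage on CPython
dict_from_mapping = dict(b)   # goes through keys() and __getitem__
```

With the abstract base class, the five methods above are the whole contract, and the remaining mapping methods (`keys`, `items`, `update`, `pop`, and so on) are derived from them.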

gmr commented 6 years ago

Errors extending dict:

======================================================================
FAIL: test_cast_to_dict (tests.FlatDictTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/gavinr/Source/Libraries/flatdict/tests.py", line 143, in test_cast_to_dict
    self.assertDictEqual(value, self.FLAT_EXPECTATION)
AssertionError: {} != {'foo:bar:baz': 0, 'thud': 5, 'xyzzy': 'pl[303 chars]': 2}
- {}
+ {'foo:bar:baz': 0,
+  'foo:bar:corge': 2,
+  'foo:bar:qux': 1,
+  'foo:grault:baz': 3,
+  'foo:grault:corge': 5,
+  'foo:grault:qux': 4,
+  'foo:list': ['F', 'O', 'O'],
+  'foo:set': {10, 20, 30},
+  'foo:tuple': ('F', 0, 0),
+  'fred': 4,
+  'garply:bar': 1,
+  'garply:baz': 2,
+  'garply:foo': 0,
+  'garply:qux:corge': 3,
+  'thud': 5,
+  'waldo:fred': 6,
+  'waldo:wanda': 7,
+  'xyzzy': 'plugh'}

======================================================================
FAIL: test_eq (tests.FlatDictTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/gavinr/Source/Libraries/flatdict/tests.py", line 196, in test_eq
    self.assertEqual(self.value,  self.value.copy())
AssertionError: "{'fo[318 chars] '5', 'waldo:fred': '6', 'waldo:wanda': '7', 'xyzzy': 'plugh'}" != "{'fo[318 chars] '5', 'waldo:fred': '6', 'waldo:wanda': '7', 'xyzzy': 'plugh'}"

======================================================================
FAIL: test_pickling (tests.FlatDictTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/gavinr/Source/Libraries/flatdict/tests.py", line 279, in test_pickling
    self.assertEqual(pickle.loads(pickled), self.value)
AssertionError: "{'fo[318 chars] '5', 'waldo:fred': '6', 'waldo:wanda': '7', 'xyzzy': 'plugh'}" != "{'fo[318 chars] '5', 'waldo:fred': '6', 'waldo:wanda': '7', 'xyzzy': 'plugh'}"

======================================================================
FAIL: test_pop_top (tests.FlatDictTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/gavinr/Source/Libraries/flatdict/tests.py", line 231, in test_pop_top
    self.assertEqual(expectation, self.value.pop('foo'))
AssertionError: "{'ba[117 chars]F', 'O', 'O']", 'set': '{10, 20, 30}', 'tuple': "('F', 0, 0)"}" != "{'ba[117 chars]F', 'O', 'O']", 'set': '{10, 20, 30}', 'tuple': "('F', 0, 0)"}"

======================================================================
FAIL: test_update (tests.FlatDictTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/gavinr/Source/Libraries/flatdict/tests.py", line 270, in test_update
    self.assertEqual(self.value, expectation)
AssertionError: "{'fo[363 chars] '5', 'waldo:fred': '6', 'waldo:wanda': '7', 'xyzzy': 'plugh'}" != "{'fo[363 chars] '5', 'waldo:fred': '6', 'waldo:wanda': '7', 'xyzzy': 'plugh'}"

======================================================================
FAIL: test_cast_to_dict (tests.FlatterDictTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/gavinr/Source/Libraries/flatdict/tests.py", line 143, in test_cast_to_dict
    self.assertDictEqual(value, self.FLAT_EXPECTATION)
AssertionError: {} != {'foo:bar:baz': 0, 'xyzzy': 'plugh', 'foo:[408 chars]: 20}
- {}
+ {'foo:abc:def': True,
+  'foo:bar:baz': 0,
+  'foo:bar:corge': 2,
+  'foo:bar:qux': 1,
+  'foo:grault:baz': 3,
+  'foo:grault:corge': 5,
+  'foo:grault:qux': 4,
+  'foo:list:0': 'F',
+  'foo:list:1': 'O',
+  'foo:list:2': 'O',
+  'foo:set:0': 10,
+  'foo:set:1': 20,
+  'foo:set:2': 30,
+  'foo:tuple:0': 'F',
+  'foo:tuple:1': 0,
+  'foo:tuple:2': 0,
+  'fred': 4,
+  'garply:bar': 1,
+  'garply:baz': 2,
+  'garply:foo': 0,
+  'garply:qux:corge': 3,
+  'thud': 5,
+  'waldo:fred': 6,
+  'waldo:wanda': 7,
+  'xyzzy': 'plugh'}

======================================================================
FAIL: test_eq (tests.FlatterDictTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/gavinr/Source/Libraries/flatdict/tests.py", line 196, in test_eq
    self.assertEqual(self.value,  self.value.copy())
AssertionError: "{'fo[429 chars] '5', 'waldo:fred': '6', 'waldo:wanda': '7', 'xyzzy': 'plugh'}" != "{'fo[429 chars] '5', 'waldo:fred': '6', 'waldo:wanda': '7', 'xyzzy': 'plugh'}"

======================================================================
FAIL: test_pickling (tests.FlatterDictTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/gavinr/Source/Libraries/flatdict/tests.py", line 279, in test_pickling
    self.assertEqual(pickle.loads(pickled), self.value)
AssertionError: "{'fo[429 chars] '5', 'waldo:fred': '6', 'waldo:wanda': '7', 'xyzzy': 'plugh'}" != "{'fo[429 chars] '5', 'waldo:fred': '6', 'waldo:wanda': '7', 'xyzzy': 'plugh'}"

======================================================================
FAIL: test_pop_top (tests.FlatterDictTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/gavinr/Source/Libraries/flatdict/tests.py", line 231, in test_pop_top
    self.assertEqual(expectation, self.value.pop('foo'))
AssertionError: "{'ab[200 chars]'set:2': '30', 'tuple:0': 'F', 'tuple:1': '0', 'tuple:2': '0'}" != "{'ab[200 chars]'set:2': '30', 'tuple:0': 'F', 'tuple:1': '0', 'tuple:2': '0'}"

======================================================================
FAIL: test_update (tests.FlatterDictTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/gavinr/Source/Libraries/flatdict/tests.py", line 270, in test_update
    self.assertEqual(self.value, expectation)
AssertionError: "{'fo[474 chars] '5', 'waldo:fred': '6', 'waldo:wanda': '7', 'xyzzy': 'plugh'}" != "{'fo[474 chars] '5', 'waldo:fred': '6', 'waldo:wanda': '7', 'xyzzy': 'plugh'}"

Name          Stmts   Miss Branch BrPart  Cover
-----------------------------------------------
flatdict.py     172      1     99      2    99%
----------------------------------------------------------------------
Ran 78 tests in 0.100s

FAILED (failures=10)

Glad to know iteritems fixed it. Are we good to close this then?

jgostick commented 6 years ago

Close it, and thanks!