wesm / pydata-book

Materials and IPython notebooks for "Python for Data Analysis" by Wes McKinney, published by O'Reilly Media

Ch 05: Changes to Python 3.6+ dict insertion order sorting and DataFrame instantiation #145

Closed jslatane closed 3 years ago

jslatane commented 3 years ago

On page 134, there is the example of creating a DataFrame from nested dicts:

pop = {'Nevada': {2001: 2.4, 2002: 2.9}, 'Ohio': {2000: 1.5, 2001: 1.7, 2002: 3.6}}
frame3 = pd.DataFrame(pop)
frame3

Then it shows the output as this:

      Nevada  Ohio
2000     NaN   1.5
2001     2.4   1.7
2002     2.9   3.6

But this reflects old behavior. In Python 3.6+, dicts preserve insertion order, and that order is carried over when instantiating DataFrames from dicts. So running the first block above now results in this output instead:

      Nevada  Ohio
2001     2.4   1.7
2002     2.9   3.6
2000     NaN   1.5
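To illustrate why 2000 ends up last, here is a minimal plain-Python sketch of an order-preserving union of the inner dict keys. It only mimics the observed result and is not meant to show how pandas builds the index internally; the name `index_order` is made up for this example.

```python
pop = {'Nevada': {2001: 2.4, 2002: 2.9}, 'Ohio': {2000: 1.5, 2001: 1.7, 2002: 3.6}}

# Walk the inner dicts in insertion order and keep each year the first
# time it appears. dict.fromkeys drops duplicates while preserving order.
index_order = list(dict.fromkeys(year for inner in pop.values() for year in inner))

print(index_order)          # first-seen order: 2001, 2002, then 2000
print(sorted(index_order))  # the sorted order shown in the book
```

Nevada contributes 2001 and 2002 first, so 2000 (seen only under Ohio) is appended at the end.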

Some of the accompanying text should be changed to reflect this. For example, page 135 says:

The keys in the inner dicts are combined and sorted to form the index in the result. This isn't true if an explicit index is specified:
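For anyone who wants the book's sorted output regardless of pandas version, a hedged sketch of two options: sort the index after construction, or pass an explicit index. Both `DataFrame.sort_index` and the `index=` argument are standard pandas; the variable names `sorted_frame` and `explicit` are just for illustration.

```python
import pandas as pd

pop = {'Nevada': {2001: 2.4, 2002: 2.9}, 'Ohio': {2000: 1.5, 2001: 1.7, 2002: 3.6}}

# The row order of frame3 depends on the pandas version in use.
frame3 = pd.DataFrame(pop)

# Option 1: sort the index explicitly after construction.
sorted_frame = frame3.sort_index()

# Option 2: request the row labels, in the desired order, up front.
explicit = pd.DataFrame(pop, index=[2000, 2001, 2002])

print(sorted_frame)
print(explicit)
```

Either way the rows come out as 2000, 2001, 2002, with NaN for Nevada in 2000.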

wesm commented 3 years ago

Thanks. This will be fixed in the 3rd edition.