alchemistry / alchemlyb

the simple alchemistry library
https://alchemlyb.readthedocs.io
BSD 3-Clause "New" or "Revised" License
191 stars 49 forks source link

[CI] account for changed pandas.concat sorting behavior #201

Closed xiki-tempula closed 2 years ago

xiki-tempula commented 2 years ago

Fix #200 It seems that pandas 1.4.3 has some issue.

codecov[bot] commented 2 years ago

Codecov Report

Merging #201 (a651b3e) into master (fd08e1d) will increase coverage by 0.15%. The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #201      +/-   ##
==========================================
+ Coverage   98.04%   98.20%   +0.15%     
==========================================
  Files          24       24              
  Lines        1333     1340       +7     
  Branches      281      290       +9     
==========================================
+ Hits         1307     1316       +9     
  Misses          5        5              
+ Partials       21       19       -2     
Impacted Files Coverage Δ
src/alchemlyb/parsing/namd.py 99.17% <100.00%> (+<0.01%) :arrow_up:
src/alchemlyb/parsing/__init__.py 100.00% <0.00%> (ø)
src/alchemlyb/parsing/amber.py 97.48% <0.00%> (+0.02%) :arrow_up:
src/alchemlyb/estimators/mbar_.py 98.24% <0.00%> (+0.03%) :arrow_up:
src/alchemlyb/parsing/gmx.py 98.18% <0.00%> (+0.60%) :arrow_up:
src/alchemlyb/parsing/gomc.py 93.20% <0.00%> (+1.12%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update fd08e1d...a651b3e. Read the comment docs.

orbeckst commented 2 years ago

Do we know what the issue with Pandas is? Is 1.4.3 the latest version and does, say, 1.4.4 work? Basically I am wondering if this is has been wrong the whole time and is fixed now and we need to change the test or if this is a temporary issue?

If it's temporary then we also need to change the pip and condas stuff so that version of pandas is never installed.

xiki-tempula commented 2 years ago

@orbeckst Sorry for the delay, I will investigate this issue in the weekends.

orbeckst commented 2 years ago

Sure, no problem, thanks for donating your time!!!!

dotsdl commented 2 years ago

Investigating this now. Looks like in pandas 1.4.3 the columns of our u_nk dataframes are being sorted from smallest to highest lambda value. Digging into if this is intentional behavior in latest pandas.

dotsdl commented 2 years ago

Behavior of this test prior to pandas 1.4.3 gives a u_nk that looks like:

u_nk
                    1.0       0.9  0.8  0.7  0.6  0.5  0.4  0.3  0.2       0.1  0.0
time    fep-lambda                                                                 
4000.0  1.0         0.0  0.101483  NaN  NaN  NaN  NaN  NaN  NaN  NaN       NaN  NaN
4010.0  1.0         0.0  0.396537  NaN  NaN  NaN  NaN  NaN  NaN  NaN       NaN  NaN
4020.0  1.0         0.0  0.813706  NaN  NaN  NaN  NaN  NaN  NaN  NaN       NaN  NaN
4030.0  1.0         0.0  0.533748  NaN  NaN  NaN  NaN  NaN  NaN  NaN       NaN  NaN
4040.0  1.0         0.0 -0.238862  NaN  NaN  NaN  NaN  NaN  NaN  NaN       NaN  NaN
...                 ...       ...  ...  ...  ...  ...  ...  ...  ...       ...  ...
49960.0 0.0         NaN       NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN -2.880429  0.0
49970.0 0.0         NaN       NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN -2.901732  0.0
49980.0 0.0         NaN       NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN -3.063265  0.0
49990.0 0.0         NaN       NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN -2.727282  0.0
50000.0 0.0         NaN       NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN -1.969434  0.0

[30170 rows x 11 columns]

which gives for the resulting BAR estimator's delta_f_:

           1.0       0.9       0.8       0.7       0.6       0.5       0.4        0.3        0.2       0.1       0.0
1.0   0.000000 -1.296011 -2.914735 -4.520049 -5.732406 -6.495677 -8.836413 -10.094600 -10.228049 -8.583977 -4.184060
0.9   1.296011  0.000000 -1.618724 -3.224038 -4.436395 -5.199666 -7.540402  -8.798590  -8.932038 -7.287966 -2.888049
0.8   2.914735  1.618724  0.000000 -1.605314 -2.817671 -3.580942 -5.921678  -7.179866  -7.313314 -5.669242 -1.269325
0.7   4.520049  3.224038  1.605314  0.000000 -1.212357 -1.975628 -4.316364  -5.574552  -5.708000 -4.063928  0.335989
0.6   5.732406  4.436395  2.817671  1.212357  0.000000 -0.763271 -3.104007  -4.362195  -4.495643 -2.851571  1.548346
0.5   6.495677  5.199666  3.580942  1.975628  0.763271  0.000000 -2.340736  -3.598924  -3.732372 -2.088300  2.311617
0.4   8.836413  7.540402  5.921678  4.316364  3.104007  2.340736  0.000000  -1.258188  -1.391636  0.252436  4.652353
0.3  10.094600  8.798590  7.179866  5.574552  4.362195  3.598924  1.258188   0.000000  -0.133449  1.510624  5.910541
0.2  10.228049  8.932038  7.313314  5.708000  4.495643  3.732372  1.391636   0.133449   0.000000  1.644072  6.043989
0.1   8.583977  7.287966  5.669242  4.063928  2.851571  2.088300 -0.252436  -1.510624  -1.644072  0.000000  4.399917
0.0   4.184060  2.888049  1.269325 -0.335989 -1.548346 -2.311617 -4.652353  -5.910541  -6.043989 -4.399917  0.000000

This means that using delta_f_.iloc[0, -1] as we do for each test (and generally advise users) here gives the free energy difference going from lambda = 1 to lambda = 0.

With pandas 1.4.3, a fix to pd.concat's sort parameter was included, which changes the behavior of this line in the namd u_nk parser. We now get for u_nk:

                    0.0       0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8       0.9  1.0
time    fep-lambda                                                                 
4000.0  1.0         NaN       NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  0.101483  0.0
4010.0  1.0         NaN       NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  0.396537  0.0
4020.0  1.0         NaN       NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  0.813706  0.0
4030.0  1.0         NaN       NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  0.533748  0.0
4040.0  1.0         NaN       NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN -0.238862  0.0
...                 ...       ...  ...  ...  ...  ...  ...  ...  ...       ...  ...
49960.0 0.0         0.0 -2.880429  NaN  NaN  NaN  NaN  NaN  NaN  NaN       NaN  NaN
49970.0 0.0         0.0 -2.901732  NaN  NaN  NaN  NaN  NaN  NaN  NaN       NaN  NaN
49980.0 0.0         0.0 -3.063265  NaN  NaN  NaN  NaN  NaN  NaN  NaN       NaN  NaN
49990.0 0.0         0.0 -2.727282  NaN  NaN  NaN  NaN  NaN  NaN  NaN       NaN  NaN
50000.0 0.0         0.0 -1.969434  NaN  NaN  NaN  NaN  NaN  NaN  NaN       NaN  NaN

[30170 rows x 11 columns]

which features sorted columns. The resulting BAR estimator's delta_f_ then gives:

          0.0       0.1        0.2        0.3       0.4       0.5       0.6       0.7       0.8       0.9        1.0
0.0  0.000000 -4.399917  -6.043989  -5.910541 -4.652353 -2.311617 -1.548346 -0.335989  1.269325  2.888049   4.184060
0.1  4.399917  0.000000  -1.644072  -1.510624 -0.252436  2.088300  2.851571  4.063928  5.669242  7.287966   8.583977
0.2  6.043989  1.644072   0.000000   0.133449  1.391636  3.732372  4.495643  5.708000  7.313314  8.932038  10.228049
0.3  5.910541  1.510624  -0.133449   0.000000  1.258188  3.598924  4.362195  5.574552  7.179866  8.798590  10.094600
0.4  4.652353  0.252436  -1.391636  -1.258188  0.000000  2.340736  3.104007  4.316364  5.921678  7.540402   8.836413
0.5  2.311617 -2.088300  -3.732372  -3.598924 -2.340736  0.000000  0.763271  1.975628  3.580942  5.199666   6.495677
0.6  1.548346 -2.851571  -4.495643  -4.362195 -3.104007 -0.763271  0.000000  1.212357  2.817671  4.436395   5.732406
0.7  0.335989 -4.063928  -5.708000  -5.574552 -4.316364 -1.975628 -1.212357  0.000000  1.605314  3.224038   4.520049
0.8 -1.269325 -5.669242  -7.313314  -7.179866 -5.921678 -3.580942 -2.817671 -1.605314  0.000000  1.618724   2.914735
0.9 -2.888049 -7.287966  -8.932038  -8.798590 -7.540402 -5.199666 -4.436395 -3.224038 -1.618724  0.000000   1.296011
1.0 -4.184060 -8.583977 -10.228049 -10.094600 -8.836413 -6.495677 -5.732406 -4.520049 -2.914735 -1.296011   0.000000

So calling delta_f_.iloc[0, -1] gives the negative of the previous result, since we are now getting the free energy difference going from lambda = 0 to lambda = 1.

dotsdl commented 2 years ago

As for what the fix should be, I'm not exactly sure. We generally do use delta_f_.iloc[0,-1] to access the top-right element of the free energy difference dataframe from an estimator, which in many cases corresponds to the difference between the state in which all lambdas = 0 and all lambdas = 1. However, I don't believe we've ever made this guarantee across the library.

dotsdl commented 2 years ago

Just checked the docs, and looks like I was wrong: we're advising users there to use delta_f_.loc and explicitly specifying lambda values for [from, to]. This makes our choice of sorting vs. not sorting less important here. Thoughts?

dotsdl commented 2 years ago

Restoring previous behavior for the namd parser involves switching to sort=False in this line, if that's what we'd like to do. I don't know if there are other repercussions to this choice.

orbeckst commented 2 years ago

Thank you for getting to the bottom of the problem.

Does https://github.com/pandas-dev/pandas/pull/47206 restore the previous behavior? If so, we can just blacklist all versions of pandas with the changed sorting behavior. That would restore our status-quo and avoid breakage of existing scripts.

Should we use .loc in the tests then?

For the deeper question about guaranteeing sorting orders etc we should raise a separate issue. I am happy, though, that we're not giving wrong advice in our docs.

dotsdl commented 2 years ago

Thank you for getting to the bottom of the problem.

Does pandas-dev/pandas#47206 restore the previous behavior? If so, we can just blacklist all versions of pandas with the changed sorting behavior. That would restore our status-quo and avoid breakage of existing scripts.

My understanding is that this PR on pandas fixes what was broken sorting behavior, and that our call to sort=True never worked. But now it does work, so now we end up with a sorted-column-names u_nk from that parser where we didn't have one before.

Should we use .loc in the tests then?

Yes, probably. I can make this change.

For the deeper question about guaranteeing sorting orders etc we should raise a separate issue. I am happy, though, that we're not giving wrong advice in our docs.

Would you mind opening this? I'll finish up this and #197.

dotsdl commented 2 years ago

@orbeckst if satisfied, please merge. Thanks!