chemosim-lab / ProLIF

Interaction Fingerprints for protein-ligand complexes and more
https://prolif.readthedocs.io
Apache License 2.0
fp.to_dataframe(return_atoms=True,drop_empty=True) still slow #112

Closed: ReneHamburger1993 closed this issue 1 year ago

ReneHamburger1993 commented 1 year ago

Hi,

I analyzed 25k frames, and it took ~5 min to get the full atom-ID DataFrame for my selections.

I traced it down to the grouping including the "empty" columns.

If we remove them before grouping it gets much faster.

1000 frames: 10s -> ~2s

25k frames: ~300s -> 60s

So it's about a factor of 5 faster now. I will open a PR with my optimization.
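To illustrate the idea, here is a minimal sketch on a toy pandas DataFrame. It is not ProLIF's internal code: the column layout (residue, interaction, atom_pair) and the helper names `make_toy_frame`, `group_then_drop` and `drop_then_group` are assumptions made up for this example. It only shows that filtering out all-False columns before the grouping step gives the same result as filtering afterwards, while the grouping has far fewer columns to process.

```python
import numpy as np
import pandas as pd


def make_toy_frame(n_frames=100, n_residues=50, n_atom_pairs=10, n_active=8, seed=0):
    """Build a wide boolean frame where almost every column is "empty" (all False).

    Hypothetical layout: one row per frame, columns indexed by
    (residue, interaction, atom_pair).
    """
    rng = np.random.default_rng(seed)
    columns = pd.MultiIndex.from_product(
        [
            [f"RES{i}" for i in range(n_residues)],
            ["HBDonor", "Hydrophobic"],
            range(n_atom_pairs),
        ],
        names=["residue", "interaction", "atom_pair"],
    )
    data = np.zeros((n_frames, len(columns)), dtype=bool)
    # Only a handful of columns ever contain an interaction.
    active = rng.choice(len(columns), size=n_active, replace=False)
    data[:, active] = rng.random((n_frames, n_active)) < 0.5
    data[0, active] = True  # guarantee each active column is non-empty
    return pd.DataFrame(data, columns=columns).rename_axis(index="Frame")


def group_then_drop(df):
    """Group every column first, then discard the groups that are empty."""
    grouped = df.T.groupby(level=["residue", "interaction"]).any().T
    return grouped.loc[:, grouped.any(axis=0).to_numpy()]


def drop_then_group(df):
    """Discard empty columns first, so the grouping only sees non-empty ones."""
    non_empty = df.loc[:, df.any(axis=0).to_numpy()]
    return non_empty.T.groupby(level=["residue", "interaction"]).any().T


toy = make_toy_frame()
slow = group_then_drop(toy)
fast = drop_then_group(toy)

# Both orders yield the same columns and the same values.
assert slow.columns.tolist() == fast.columns.tolist()
assert (slow.to_numpy() == fast.to_numpy()).all()
```

Wrapping the two calls in `time.perf_counter()` on a larger toy frame should show the difference in runtime, though the exact speedup will depend on how sparse the real data is.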

Kind regards René

cbouy commented 1 year ago

Hi René, and thanks a lot for reporting this and contributing. I'll try to have a look at your PR this week.

Best, Cédric