rcsb / mmtf-cpp

The pure C++ implementation of the MMTF API, decoder and encoder.
MIT License
21 stars 24 forks source link

Add python bindings #35

Open danpf opened 4 years ago

danpf commented 4 years ago

this may not ever be merged, but here I recreate the basics of mmtf-py using this library only.

it is currently significantly faster than the mmtf-py implementation.

time to load a single mmtf file 1000x
cpp bare 0.29s
this library 0.44s
python og 4.34s

the only 'question' that is really left to solve is how to handle the templating of the mapdecoder. But I think that will not be too difficult. Once i get some time to look at it.

not to get ahead of myself -- but TODO:

danpf commented 4 years ago

@gtauriello thanks for your comments! I think it will be a relatively minor addition to the library when it's all said and done, hopefully totaling in at under 1000 lines (of course there's a lot of python init boilerplate that baloons that), but it's all quite simple.

danpf commented 4 years ago

@gtauriello and @speleo3 could you review this when you get a chance? I think this is ready for a 1-4 week trial run in master, and then once we are happy with it I can publish it to pip.

what are your thoughts?

danpf commented 4 years ago

@speleo3 Thank you for taking the time to review! I'll split this out into 3 requests and then I can address your concerns

gtauriello commented 4 years ago

Thanks to both of you for your efforts here and sorry if I've been quiet lately. At SWISS-MODEL, we are currently involved in some modeling efforts for SARS-CoV-2 which makes it hard for me to look at this code. That being said, I'll try to contribute what I can...

danpf commented 4 years ago

Can you add a README for the Python usage? It's not obvious what the suggested usage pattern is, the tests are the only hints.

done

The wrapping looks tedious and very explicit, I assume there is no way to do it automated? I hope this complexity is no burden for the long-term maintainability of mmtf-cpp.

I actually did use https://github.com/RosettaCommons/binder to generate the initial bindings.

but in this case binder is not a great fit because

  1. msgpack complicates things due to its variant typing system
  2. binder bindings were slower because they did not destroy the c++ data, they returned copies/lists.

it was actually very easy to do most of the bindings. char related data was the only difficult thing to use.

danpf commented 4 years ago

reminder bump! (if you're still busy w/ covid stuff don't feel obligated to respond) just making sure this isn't completely forgotten (like i just did for a month)

danpf commented 3 years ago

Hi all, sorry to bump again, but I also forgot about this (and the other PRs) for a few months. Hopefully everyone is healthy and doing well!

gtauriello commented 3 years ago

Ciao all. I am so sorry that I left this pending for so long. I now merged the other 2 PRs and did a 3rd one for the segfault issue (as @speleo3 had suggested).

Also: honestly, I simply don't find time for this project any more. Can one of you @danpf or @speleo3 take it over? I think I can pass admin rights to one (or both) of you for this...

@danpf for the code itself: I think all 5 TODO items at the start of this PR are done, right? Once the other PR is merged (I need a review there), you will have to merge master into this branch to resolve merge conflicts. Other than that it seems to work just fine...

Comments on the README:

gtauriello commented 3 years ago

@danpf One more thing for after the PR and master merges. Can you add an item to the CHANGELOG about the python bindings?