mmagnus / rna-tools

🔧rna-tools: a toolbox to analyze sequences, structures and simulations of RNA (and more) used by RNA CASP, RNA PUZZLES, and me ;-) docs @ http://rna-tools.rtfd.io web @ http://rna-tools.online
GNU General Public License v3.0
152 stars 43 forks

Make it possible for RNAStructure to get initialized with a file #129

Open valentin994 opened 2 years ago

valentin994 commented 2 years ago

Hi, I'm working on an API where I need to get the sequence out of a .pdb file. As the code base is set up now, you need a local copy of the file, which means the file has to be handled twice in an API environment: first the user uploads it to the server, and then I have to save it to disk so I can pass a path to the RNAStructure class. For now I took out the parsing part of the code and implemented it on my side, but if you would like, I could create a PR that supports this functionality, maybe with a bit of cleanup.

mmagnus commented 2 years ago

Dear @valentin994, it's great that you found the code useful. Let me know if you need any help.

And yeah, let me see what you have so we can improve the package here.

valentin994 commented 2 years ago

In the end, I created my own parser. It might be useful here too, or Biopython could be used instead, so let me know what you think.

So the problem I stumbled upon when using RNAStructure().get_sequence is that I didn't always get the sequence I expected (I can't really explain it in biological terms, as I'm a developer, but I'll run through examples that might give you some insight into this).

For example, for the PDB files "2l8h", "6b14", and "6las", the output from get_sequence() would be:

While I would expect:

The expected sequences are what you get if you download the FASTA version of the files mentioned and pull out the sequence with Biopython's SeqIO parser. I'm not sure whether the sequences I get from this package are the expected behaviour, or whether there is any need to parse them the way I do now. If you find it useful, I can create a PR or show you in more depth how it would look. I hope I managed to explain it well enough 😓
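The actual outputs are not shown above, so the cause of the mismatch is not known. One common cause, stated here only as an assumption, is that the FASTA files distributed for a PDB entry reflect the full sequence listed in the SEQRES records, while a sequence derived from coordinates contains only the residues that were actually modelled. A minimal sketch of reading the SEQRES sequence directly from PDB text, using only the standard library, could look like this (`seqres_sequences` and `demo_seqres_line` are hypothetical names, not part of rna-tools):

```python
RNA_RESIDUES = {"A", "C", "G", "U"}


def seqres_sequences(pdb_text: str) -> dict:
    """Return {chain_id: sequence} taken from the SEQRES records of PDB text.

    Residue names other than A, C, G, U are written as "X"; this is a
    simplification for the sketch, not how any particular library behaves.
    """
    chains = {}
    for line in pdb_text.splitlines():
        # SEQRES: chain ID in column 12, residue names from column 20 onward.
        if line.startswith("SEQRES") and len(line) > 19:
            chain_id = line[11]
            chains.setdefault(chain_id, []).extend(line[19:].split())
    return {
        chain_id: "".join(name if name in RNA_RESIDUES else "X" for name in names)
        for chain_id, names in chains.items()
    }


def demo_seqres_line(serial: int, chain_id: str, total: int, residues: list) -> str:
    """Build one fixed-column SEQRES record (helper for the demo only)."""
    body = " ".join("%3s" % name for name in residues)
    return "SEQRES %3d %1s %4d  %s\n" % (serial, chain_id, total, body)


if __name__ == "__main__":
    text = demo_seqres_line(1, "A", 4, ["G", "G", "C", "U"])
    print(seqres_sequences(text))
```

Comparing this result with the coordinate-based sequence for the same chain would show whether missing or modified residues explain the difference.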

valentin994 commented 2 years ago

Oh, I got sidetracked from the original question. But yeah, initializing with bytes can also be done.
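As a rough illustration of what initializing from bytes could mean, here is a minimal standard-library sketch that reads the per-chain sequence from the ATOM records of an uploaded file without writing it to disk. It is not the rna-tools implementation and not the parser mentioned above; `sequence_from_pdb_bytes` and `demo_atom_line` are hypothetical names, and the sketch assumes a fixed-column PDB file, reads the first model only, and reports non-standard residue names as "X".

```python
import io

RNA_RESIDUES = {"A", "C", "G", "U"}


def sequence_from_pdb_bytes(data: bytes) -> dict:
    """Return {chain_id: sequence} from the ATOM records of in-memory PDB bytes."""
    chains = {}
    seen = set()
    # Wrap the decoded upload in a file-like object; no temporary file is needed.
    with io.StringIO(data.decode("ascii", errors="replace")) as handle:
        for line in handle:
            if line.startswith("ENDMDL"):
                break  # stop after the first model
            if not line.startswith("ATOM") or len(line) < 27:
                continue
            res_name = line[17:20].strip()  # columns 18-20
            chain_id = line[21]             # column 22
            res_id = line[22:27]            # residue number plus insertion code
            if (chain_id, res_id) in seen:
                continue  # another atom of a residue already counted
            seen.add((chain_id, res_id))
            letter = res_name if res_name in RNA_RESIDUES else "X"
            chains.setdefault(chain_id, []).append(letter)
    return {chain_id: "".join(letters) for chain_id, letters in chains.items()}


def demo_atom_line(serial: int, atom: str, res_name: str, chain_id: str, res_seq: int) -> str:
    """Build one fixed-column ATOM record with zeroed coordinates (demo only)."""
    return "ATOM  %5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f\n" % (
        serial, atom, res_name, chain_id, res_seq, 0.0, 0.0, 0.0)


if __name__ == "__main__":
    uploaded = "".join([
        demo_atom_line(1, "P", "G", "A", 1),
        demo_atom_line(2, "C4'", "G", "A", 1),
        demo_atom_line(3, "P", "C", "A", 2),
        demo_atom_line(4, "P", "U", "B", 1),
    ]).encode("ascii")
    print(sequence_from_pdb_bytes(uploaded))
```

In an API handler, the bytes of the uploaded file would be passed straight to the function, which avoids the second save to disk described in the original request.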