jensengroup / propka

PROPKA predicts the pKa values of ionizable groups in proteins and protein-ligand complexes based in the 3D structure.
http://propka.org
GNU Lesser General Public License v2.1
263 stars 58 forks source link

Allows for file-like objects to be passed to read_molecule_file #84

Closed IAlibay closed 4 years ago

IAlibay commented 4 years ago

Fixes #83

Changes:

Question:

codecov[bot] commented 4 years ago

Codecov Report

Merging #84 into master will increase coverage by 0.16%. The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #84      +/-   ##
==========================================
+ Coverage   70.24%   70.41%   +0.16%     
==========================================
  Files          25       25              
  Lines        4137     4140       +3     
==========================================
+ Hits         2906     2915       +9     
+ Misses       1231     1225       -6     
Impacted Files Coverage Δ
propka/input.py 62.87% <100.00%> (+4.33%) :arrow_up:

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 01debbf...81e51c3. Read the comment docs.

orbeckst commented 4 years ago

Should the extra attribute checks for the file-like objects removed in #40 be added back?

I don't remember why I added all of them way-back-when – probably related to MDAnalysis.lib.util.isstream. It is possible that I wasn't quite sure what was needed to ultimately make it work. If your streamlined version does the job then I think it's fine. I would make sure the StringIO works.

(I wouldn't bother with streams from the network (via urllib2 or request) although it could be fun :-).)

IAlibay commented 4 years ago

Should the extra attribute checks for the file-like objects removed in #40 be added back?

I don't remember why I added all of them way-back-when – probably related to MDAnalysis.lib.util.isstream. It is possible that I wasn't quite sure what was needed to ultimately make it work. If your streamlined version does the job then I think it's fine. I would make sure the StringIO works.

(I wouldn't bother with streams from the network (via urllib2 or request) although it could be fun :-).)

As it is now, it's probably not the safest way to ensure the object will be read, but it should work for normal use cases.

In some cases e.g. BufferedIOBase derived objects, might not work (i.e. if object.seekable() returns False). Although I'm not really sure of a case where this would be useful / worth handling.

orbeckst commented 4 years ago

Very nice, thank you @IAlibay .

@sobolevnrm / @speleo3 do you want to have a quick look at the PR? (Disclosure: I am biased, I want the feature to work again as soon as possible...)

speleo3 commented 4 years ago

Looks pretty good to me, but I have one comment: I think the argument logic of read_molecule_file can be streamlined. Instead of having the filename either in the first or the third argument, and allowing the first argument to be either a path or a stream, I suggest this API:

def read_molecule_file(filename: str, mol_container, stream=None):
    """
    Args:
        filename (str): input file name
        mol_container:  MolecularContainer object
        stream: optional input file handle. If None, then open `filename` as a local file for reading.
    """

What do you think?

sobolevnrm commented 4 years ago

Very nice, thank you @IAlibay .

@sobolevnrm / @speleo3 do you want to have a quick look at the PR? (Disclosure: I am biased, I want the feature to work again as soon as possible...)

I probably won't be able to look at it until tonight. However, I'm confident in the assessment of you and @speleo3. Thanks!

IAlibay commented 4 years ago

Looks pretty good to me, but I have one comment: I think the argument logic of read_molecule_file can be streamlined. Instead of having the filename either in the first or the third argument, and allowing the first argument to be either a path or a stream, I suggest this API:

def read_molecule_file(filename: str, mol_container, stream=None):
    """
    Args:
        filename (str): input file name
        mol_container:  MolecularContainer object
        stream: optional input file handle. If None, then open `filename` as a local file for reading.
    """

What do you think?

That is a lot cleaner :) It makes the try/except block a lot nicer.

sobolevnrm commented 4 years ago

This looks awesome -- thank you, all!