pyahmed / PyCORN

A script to extract data from ÄKTA/UNICORN result-files (.res)
47 stars 26 forks source link

Allow `import pycorn` #6

Closed wackywendell closed 9 years ago

wackywendell commented 9 years ago

Thank you so much for making this library! It makes extracting data from those files so much easier.

The one thing I'd like is to be able to use it from other Python scripts, so I could do something like

import pycorn

data = pycorn.load(filename)

[use data somehow]

Do you mind if I start working on this and submit a pull request to you?

Thanks again for all your work on this!

wackywendell commented 9 years ago

I did some work on this in my importable branch.

There are some definite changes on that branch, but all the original behavior is retained. In other words, if you run pycorn.py -p -e myfile.res with my branch, it gives (or should give) you exactly the same files. These changes do add a couple things:

  1. It moves all functions into a class, which allows you to read multiple files without requiring all of them to use the same global variables (such as inj_sel or file_name)
  2. It puts main() behind the standard if __name__ == '__main__': construct, so that main() isn't called if you use import pycorn in another script, and now you can use import pycorn in other scripts:

    import pycorn
    
    data = pycorn.Pycorn("filename.res")
    data.load()
  3. Based on point 1, it now accepts multiple files on the command line: pycorn.py -p -e myfile.res myfile2.res is now accepted, or pycorn.py -pe *.res (and will plot and extract data from both files, in the same way as if it had been called with pycorn.py -pe myfile.res ; pycorn.py -pe myfile2.res).

The comparison makes it look like I changed everything, but really its all your code, I just moved it inside a class.

Anyways, those are some big changes, but I'd love to hear what you think!

I'm also thinking of splitting up the plotter() functionality, so that one could use this module from inside IPython and, for example, put two graphs in the same figure. Of course, I'd plan on maintaining the original plotter() function, but maybe have it call out to another function for plotting the fractions, for example, so that one could use only portions of it if one wanted.

What do you think? I'd love to hear your thoughts before I open a formal pull request.

And thanks again for putting this out there! I'm so glad I didn't have to worry about reverse-engineering these files myself, that seems like it must have been pretty challenging!

pyahmed commented 9 years ago

Thank you for your comments. Regarding it being importable, I'm still learning python - didn't really see a way how to do it but at the same time still working as an "argparse"-style CLI script.

The whole if name == main never made sense to me - but now it does :-) The concept of classes is still a bit foreign to me but it definitely adds some features (that have also been requested before).

I'm also thinking of splitting up the plotter() functionality, so that one could use this module from inside IPython and, for example, put two graphs in the same figure.

There are a couple (not sure how realistic) of things I'd like to retain:

I'm not opposed to IPython, just that core functionality should be available everywhere :-)

but maybe have it call out to another function for plotting the fractions, for example, so that one could use only portions of it if one wanted.

Do you mean plot fractions in a specific range?

What do you think? I'd love to hear your thoughts before I open a formal pull request. Well this really cool but keep in mind I'm still learning Python/programming so I'm not the expert on what application design/layout is good. I'll look over the code today and see what I understand ;-)

If you have time, maybe you can look at issue #5 (lot of requests for this). How would you solve/approach this?

I'm so glad I didn't have to worry about reverse-engineering these files myself, that seems like it must have been pretty challenging!

That was quite an experience, thinking in hex. I've tried to document the file format as much as I could, there is still a couple of things I haven't added but once I finish this up - I'll add it to the docs.

wackywendell commented 9 years ago

Thanks for your comments! I'll keep working on this. A couple things to add:

There are a couple (not sure how realistic) of things I'd like to retain:

  • Keep compatibility with python2&3
  • Keep dependency to external libraries to a minimum
  • Not require or somehow make it dependent on IPython

I don't think that's hard to maintain at all! I'll work on a new branch that addresses #5, and address it in that thread.

Guillawme commented 9 years ago

Thanks for your comments! I'll keep working on this. A couple things to add:

There are a couple (not sure how realistic) of things I'd like to retain:

  • Keep compatibility with python2&3
  • Keep dependency to external libraries to a minimum
  • Not require or somehow make it dependent on IPython

I don't think that's hard to maintain at all!

Good to know it wouldn't be hard to maintain, because I also think these are essential properties of the tool.

pyahmed commented 9 years ago

@wackywendell - I had a look at your fork, this looks really good. I've been thinking (slowly..) about the structure of the script. Would it be better to reduce the current pycorn-script to just extracting/reading the data and making it available? Other script/s would then do the csv-writing or plotting, like you have already started with the example uvplotter (which also has cli-interface :+1: ).

The reading of the data will not change for the current UNICORN files so might as well make it stable/robust and not touch it anymore ;)

wackywendell commented 9 years ago

Thanks for your feedback!

Yes, I like the idea of having scripts in their own separate folder. If it were my project that I were building from the ground up, I wouldn't have the main file do anything - just make it importable, and then have scripts do whatever you wanted them to. Then people arriving at the project could just look at the names of the scripts and see something like fractionplotter.py, uvplotter.py, csvwriter.py and think "oh! write a CSV! that's what I want!" This is your project though, and whatever you like makes sense. If you want to split it out, go ahead! (-:

I made uvplotter.py for my own use - but figured it also made a good example script for others, so... there it is. One thing it doesn't do is plot the fractions at the bottom, which would be nice; I was trying to find a nice way to separate that functionality from the main code and put it in its own method so that the same code could be used both in your script and in mine, but separating that out was rather challenging, so I haven't gotten that working. That, and this is a side-project of a side-project for me, so... time is a bit limited.

By the way, did you also notice that I added a setup.py? If you download my fork and run python setup.py install, then you should be able to import pycorn from Python from any directory. If you want to upload it to PyPI, that should be pretty easy now! Then newcomers can just run pip install pycorn, and it should just work!

Do you want me to submit a pull request for what I have now?

pyahmed commented 9 years ago

Yeah.. I'm not sure what is the 'best' design/structure, so I'll just play around with your current version and try to keep core functionality (reading/decoding the data) in one script:-)

By the way, did you also notice that I added a setup.py?

Yes - PyPI will come, at some point ;-)

Do you want me to submit a pull request for what I have now?

Yes go ahead. I'll go from there. I might have some questions regarding the class-structure. For example in uvplotter.py you have

UVx, UV = np.array(pc['UV']['data']).T

what exactly does the ".T"-part do/mean?

Guillawme commented 9 years ago

Nice to see things are moving here. :-) I also think it is a good idea to keep a main script that does only one thing: read the data and make it available to other scripts. This way, other people could propose scripts to plot whatever they find useful (or even to compare data from several files, etc.) without having to study the file structure: just recover the data by calling the main script.

pyahmed commented 9 years ago

Master branch now uses pycorn class object, though I made a couple of changes. PyCORN is now available from PyPI (pip install pycorn). After installation, use it in your script (docs will be updated soon) or use the pycorn-bin.py script. This re-implements most of the features of the original script but now also does 2nd/3rd y-axis.

Thanks again @wackywendell for converting it to a class-based structure.