rachelss / SISRS

Site Identification from Short Read Sequences.
24 stars 15 forks source link

Python Port Documentation #55

Open BobLiterman opened 6 years ago

BobLiterman commented 6 years ago

Hi @anderspitman,

After bringing over the new Python port branch, there are a few things that I'm unclear about. Would you mind creating a basic README with usage instructions (how to call new SISRS, etc)? I got some errors on my system right off the bat so I may just missing something basic.

There are lots of scripts wrapped up in different subfolders and I'm not sure what's being called, what's leftover, what's Docker related, etc. It would be much easier for me to troubleshoot the most basic version of the code (non-Dockerized) with only the scripts that are called during a normal SISRS run. Once all that code works when called from the non-Docker environment, creating a parallel Docker version should be fairly straightforward.

Also, it may be a good idea to have a Docker version and a non-Docker version (perhaps on separate branches), as the required programs/Python packages to run SISRS are not altogether rare for a phylogenomicist to have preinstalled on their system. Docker is great for some, but for others that like to manage their machine in a bit more hands-on fashion, preinstalling the fairly standard-issue requirements may be preferable to a large download that installs a lot of things they may already have.

The new stuff looks slick, and I'm sure we can get everything incorporated ASAP.

Bob

anderspitman commented 6 years ago

Yeah I can definitely put up a README. We should probably have a discussion about the user interface first to make sure it's what we want, since that's probably the biggest user-facing difference currently with the python version. I'll do a little mini README in a comment here to get the conversation started.

Installation

sisrs can be installed using setupstools (and therefor eventually pip):

python setup.py install

This will installation the dependencies and make sisrs and sisrs-python available on your path.

Running

Basically, to keep things simple up front, I'm using a CLI library called click. It's great because it handles all the command line argument parsing shenanigans, as well as auto-generating a help message. However, it also imposes a rigid structure on how the program is called. I think this is a good thing, but it is different from the way sisrs used to be invoked. We need to decide if we are happy with the new interface or want to modify it and/or exactly clone the old interface. Here are examples:

sisrs-python -p 20 -a premade -c 0 align_contigs
sisrs-python -p 20 -a premade -c 0 identify_fixed_sites
sisrs-python -f DATA_DIR -z OUTPUT_DIR output_alignment
sisrs-python change_missing -m 6

The biggest differences from the old way is that options which apply to all commands (ie -p) need to come before the command name, whereas options that apply to a specific command (or commands) need to come after the command. Also, I change the names from camelCase to something more pythonic, but this was purely a preference on my part and I would not be opposed to using the old names.

anderspitman commented 6 years ago

To address some of your other concerns, based on what we've decided, I think bin/ and libexec/ are going to go away entirely. I've been porting over the scripts in libexec/ and putting them under sisrs/, which is the python package directory. The docker/ directory is around purely as an optional tool and for testing and will not affect users at all. As long as they have the depencies installed, they should be able to just run python setup.py install and then start using it, whether inside docker or not it doesn't matter.

anderspitman commented 6 years ago

I'm actually in the middle of moving everything I've done so far over to Python 3. Once that's complete I'll remove bin/ and libexec/ to clean things up a bit. I'm hoping to be done with that today or tomorrow.

BobLiterman commented 6 years ago

Great. I think in light of that if there are no exceptions, I will hold off testing until the Python 3 conversion. That way I can continue with my stuff and just do one round of testing. Sound good @rachelss?