chasewnelson / SNPGenie

Program for estimating πN/πS, dN/dS, and other diversity measures from next-generation sequencing data
GNU General Public License v3.0

Batch processing #25

Closed mhopken closed 4 years ago

mhopken commented 4 years ago

Hello,

I am trying to use SNPGenie in a for loop to batch process samples. However, it is unclear how to change the output directory. The software processes the first sample and creates the standard output directory, then throws an error for the rest of the samples because the output folder already exists. Any help would be appreciated. Thank you, Matt

singing-scientist commented 4 years ago

Greetings, Matt! Thanks for using SNPGenie. At the time I wrote SNPGenie, I used it on a case-by-case basis for viruses and did not have the foresight to include much flexibility — apologies! As it currently operates, SNPGenie is meant to be executed only once in a given directory, and that directory is to contain the FASTA, GTF, and SNP report(s). Thus, a quick fix for parallelization would be to place each SNP report in a separate directory, then batch. Alternatively, I don't imagine it would be very difficult to update the code to place results in a user-specified directory. I could do this in ~1 week, if it would still be of use to you. Please let me know.

Yours, Chase

mhopken commented 4 years ago

Thank you for your quick reply! It would certainly be of use to me. As for the quick fix, I have the .vcf and .fasta files for each sample in individual directories, but when I run SNPGenie from the directory that contains all of the sample folders, it still puts the output in the current directory, not where the sample files are. So batch processing still does not work, and I have to run each sample individually. I appreciate the quick response!

singing-scientist commented 4 years ago

Thanks @mhopken for waiting on me. To batch in this 'quick fix' way, your parent script will have to actually enter each subdirectory before calling SNPGenie, which allows a SNPGenie_Results directory to be created in each subdirectory. Otherwise, SNPGenie_Results will be written to the overarching parent directory, and the run will fail after the first subdirectory is processed because that directory already exists.

Nevertheless, I hope to code this change for you in the next few days! Thanks for your patience.
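For readers who want to try the quick fix described above, here is a minimal Python sketch of a parent script that enters each subdirectory before launching the program. The function name `run_in_each_subdir` is hypothetical and not part of SNPGenie. The command to run is left to the caller because the exact SNPGenie arguments depend on your input files. It assumes each sample has its own subdirectory containing everything SNPGenie needs.

```python
import subprocess
from pathlib import Path


def run_in_each_subdir(parent, command):
    """Run `command` once inside every immediate subdirectory of `parent`.

    Returns a dict mapping each subdirectory name to the command's exit code.
    `command` is a list such as ["snpgenie.pl", ...]; its arguments are the
    caller's own and are not filled in here.
    """
    exit_codes = {}
    sample_dirs = sorted(p for p in Path(parent).iterdir() if p.is_dir())
    for sample_dir in sample_dirs:
        # cwd= starts the child process inside the sample directory, so a
        # results folder created relative to the working directory lands
        # there instead of in the parent directory.
        completed = subprocess.run(command, cwd=sample_dir)
        exit_codes[sample_dir.name] = completed.returncode
    return exit_codes
```

A call might look like `run_in_each_subdir("samples", ["snpgenie.pl"])`, where `samples` is a placeholder for the directory holding the per-sample folders and the argument list is extended with whatever options your data require. A nonzero exit code in the returned dict flags a sample that needs attention.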

singing-scientist commented 4 years ago

Dear @mhopken: I have added the requested ability, to be specified using the --outdir option. Some examples are given if you call snpgenie.pl --help. Please download the new script, give it a try, and let me know if it works well for you. If so, I will add these options to the official documentation.
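As an illustration of how the new option could be used for batching, here is a hedged Python sketch that gives every sample its own output directory. Only the `--outdir` option name comes from this thread. The `--outdir=<path>` spelling, the `_SNPGenie_Results` suffix, and the helper names `build_command` and `batch` are assumptions made for the example, so please check `snpgenie.pl --help` for the real syntax and any other required options.

```python
import subprocess
from pathlib import Path


def build_command(sample_dir, results_root, script="snpgenie.pl", extra_args=()):
    """Build an argument list that sends one sample's output to its own folder."""
    # Use an absolute path so it does not depend on the working directory.
    out_dir = Path(results_root).resolve() / f"{Path(sample_dir).name}_SNPGenie_Results"
    # ASSUMPTION: the option is written as --outdir=<path>; verify with --help.
    return [script, *extra_args, f"--outdir={out_dir}"]


def batch(parent, results_root, script="snpgenie.pl", extra_args=()):
    """Run the script once per immediate subdirectory of `parent`."""
    for sample_dir in sorted(p for p in Path(parent).iterdir() if p.is_dir()):
        command = build_command(sample_dir, results_root, script, extra_args)
        # Still run from inside the sample directory, since that is where the
        # input files for the sample live. check=True stops on the first failure.
        subprocess.run(command, cwd=sample_dir, check=True)
```

Because each sample gets a distinct output path, the "output folder already exists" error from the original report should not recur between samples, assuming the option behaves as described here.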

mhopken commented 4 years ago

Hello Chase,

Thank you very much for working on this! I will try it out and let you know. Much appreciated. Matt

singing-scientist commented 4 years ago

Dear @mhopken: has this issue been resolved?

mhopken commented 4 years ago

@cwnelson88 yes, it is fixed! I finally got a chance to check the batch processing. Thank you for all of your help! Cheers

singing-scientist commented 4 years ago

Great to hear that, @mhopken! I will now close the issue; please feel free to open another if you have any more feedback!