FelixKrueger / SNPsplit

Allele-specific alignment sorting
http://felixkrueger.github.io/SNPsplit/
GNU General Public License v3.0

option to specify output directory #13

Closed by vivekbhr 6 years ago

vivekbhr commented 7 years ago

Dear Felix

Thanks for the useful tool. I am trying to add SNPsplit to an RNA-seq pipeline, but the lack of an option to specify the output directory for the allele-sorted files is causing trouble. It would be good to add this parameter to SNPsplit.

Also, the documentation doesn't mention which file names, prefixes or suffixes to expect for the output files in the case of single or dual hybrid genome preparation. I figured this out in my case, but it would be good to have it in the docs.

Thanks Vivek

FelixKrueger commented 7 years ago

Hi Vivek,

It shouldn't be too difficult to add output directory handling to SNPsplit; I can try to look into this next week.

I'll also try to update the documentation so that it is more obvious what you have to expect.

Best, Felix

FelixKrueger commented 7 years ago

Hi Vivek,

I have tried to add an option -o/--output_dir <dir> to SNPsplit (and also to tag2sort) in this commit 7fef3bf6ef131a2e7184b4a165b2761cd312181d. Can you clone the latest development version and let me know if it works in your hands?

vivekbhr commented 7 years ago

sure.. Let me test that and get back to you..

FelixKrueger commented 7 years ago

Right, I had to change how the current working directory is passed on to tag2sort. It should work now, apologies.

vivekbhr commented 7 years ago

If I read the file input_dir/file.bam and write to the directory output_dir, it tries to create the file output_dir/input_dir/<filename>*SNPsplit.report.txt and fails.
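
The failure described above is what one would expect if the input path is appended to the output directory without first stripping its directory part. A minimal shell sketch of the intended behaviour follows. The function name `report_path` is hypothetical, the `.SNPsplit_report.txt` suffix is taken from the report names posted further down this thread, and none of this is SNPsplit's actual implementation.

```shell
# Hypothetical helper: build the report path inside the output directory.
# Assumption: the report is named <input basename without .bam>.SNPsplit_report.txt
report_path() {
    local output_dir="$1"
    local input_bam="$2"
    local name
    # basename drops the leading "input_dir/" and the ".bam" suffix,
    # so the result cannot become output_dir/input_dir/...
    name=$(basename "$input_bam" .bam)
    # ${output_dir%/} removes one trailing slash to avoid "dir//file".
    printf '%s/%s.SNPsplit_report.txt\n' "${output_dir%/}" "$name"
}
```

With this, `report_path output_dir input_dir/file.bam` yields a path directly under `output_dir`, whichever directory the input was read from.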

FelixKrueger commented 7 years ago

File path handling is hard... I have tried to fix this (new) problem now (here 0f705aad1c5f1ebb3803d8ab22f9938576a2b5ab). Please clone and try again. Cheers, Felix

vivekbhr commented 7 years ago

This time it works..

vivekbhr commented 7 years ago

Since SNPsplit outputs more than one log file, I think it would also be good to put them in a separate log directory (maybe inside the output dir itself) so that the output dir is a bit cleaner.

FelixKrueger commented 7 years ago

I believe specifying an output dir will also write the reports to that output directory. Does it not?

vivekbhr commented 7 years ago

yes indeed..! I was just suggesting to write to output_dir/log

FelixKrueger commented 7 years ago

Ahh, I see. To be honest I am a fan of keeping things simple (and in the same folder); can't you just do

mv *report.txt log/

once the run(s) have completed?
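
As a rough sketch of that suggestion, the reports can be moved in a post-run step of the pipeline. The function name `collect_reports` and the `log` subdirectory are assumptions for illustration. The two patterns match the report names posted further down this thread (`*SNPsplit_report.txt` and `*SNPsplit_sort.txt`); note that `*report.txt` on its own would leave the `SNPsplit_sort.txt` file behind.

```shell
# Hypothetical post-run step: move SNPsplit reports into <output_dir>/log.
collect_reports() {
    local output_dir="$1"
    local log_dir="$output_dir/log"
    local f
    mkdir -p "$log_dir" || return 1
    for f in "$output_dir"/*SNPsplit_report.txt "$output_dir"/*SNPsplit_sort.txt; do
        # An unmatched glob stays literal, so only move real files.
        if [ -f "$f" ]; then
            mv "$f" "$log_dir"/ || return 1
        fi
    done
    return 0
}
```

Calling `collect_reports test` after a run with `-o test` would leave only the BAM files in `test` and put the reports in `test/log`.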

vivekbhr commented 7 years ago

Sure. I would continue this way.. :)

vivekbhr commented 6 years ago

Hey Felix

Just found that while running SNPsplit with --output_dir, tag2sort fails to find the allele_flagged.bam file, since it still expects it to be in input_dir, while the file is actually created in output_dir. I couldn't detect this earlier since SNPsplit finished successfully, but later I found that the genome_1 and genome_2 BAM files were empty.
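
Because the run finished successfully despite producing empty files, a pipeline may want to count the reads in each allele-specific BAM before continuing. The sketch below assumes `samtools` is on the `PATH` and relies on `samtools view -c`, which prints the number of alignments in a file. The function name `require_reads` is hypothetical, and the BAM paths are whatever the run produced.

```shell
# Hypothetical sanity check: fail if any given BAM is unreadable or has no reads.
require_reads() {
    local bam count status=0
    for bam in "$@"; do
        # samtools view -c prints the number of alignments in the file.
        count=$(samtools view -c "$bam") || {
            echo "cannot read $bam" >&2
            status=1
            continue
        }
        if [ "$count" -eq 0 ]; then
            echo "no reads in $bam" >&2
            status=1
        fi
    done
    return "$status"
}
```

It could be called with the two allele-specific BAM paths right after SNPsplit returns, so that an empty result stops the pipeline instead of passing silently.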

Best, Vivek

FelixKrueger commented 6 years ago

Hmm, this should not happen as I have also changed tag2sort in this commit: 7fef3bf6ef131a2e7184b4a165b2761cd312181d. Did you clone the entire repository or just update SNPsplit itself? Can you try updating everything, and if it still fails post the entire command that failed?

vivekbhr commented 6 years ago

I was doing a git pull on the repo that I had cloned earlier. Today I removed it, did a fresh clone of the master branch and repeated the run, but got the empty files again.

Here's an example with a test file (129/CAST genome)..

~/programs/SNPsplit/SNPsplit --paired --snp_file ../../snp_genome/all_CAST_EiJ_SNPs_129S1_SvImJ_reference.based_on_GRCm38.txt -o test filtered_bam/MouseIgG_control.filtered.sortedByName.bam

Output reports are attached: MouseIgG_control.filtered.sortedByName.SNPsplit_report.txt and MouseIgG_control.filtered.sortedByName.SNPsplit_sort.txt

And this was the error:

[E::hts_open_format] fail to open file 'MouseIgG_control.filtered.sortedByName.allele_flagged.bam'
samtools view: failed to open "MouseIgG_control.filtered.sortedByName.allele_flagged.bam" for reading: No such file or directory

FelixKrueger commented 6 years ago

Hi Vivek,

Thanks for the details. I am currently still travelling but I will take a look at this next week. Cheers, Felix

FelixKrueger commented 6 years ago

Hi Vivek,

It turns out that it was working only in single-end mode, but I have now added it for paired-end and Hi-C mode as well. Sorry for that, please clone and try again. Felix

vivekbhr commented 6 years ago

Seems to be working.. Thanks :)