Suggestion: Have --printHaplotypes and --printLikelihoods on by default

ucl-pathgenomics / HaROLD

Haplotype Reconstruction of Longitudinal Deep sequencing data

MIT License


Suggestion: Have --printHaplotypes and --printLikelihoods on by default #9

Open zacherydickson opened 4 months ago

zacherydickson commented 4 months ago

When first interacting with the tool, it is not immediately clear that the output files will not be created by default. It took some time to discover that this was the issue when the program appeared to complete with no output and no errors in the log file. Additionally, the example command provided in the README file (lines 88-91) does not include the --printHaplotypes and --printLikelihoods options. The following section of the manual then states that all files will be created. Looking at the code in java/rag/HAROLD/Cluster_RG.java, it is clear that the haplotype and likelihood files are created only if requested.

It would simplify the user experience if the output files were created by default, especially as they are required inputs for the refinement step. If there is a use case where the output files are not required, then inverse options that switch them off could be provided.
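To illustrate the idea, here is a minimal sketch of default-on output flags with opt-out switches. It is not taken from Cluster_RG.java: the class name `OutputFlags` and the `--no-printHaplotypes` / `--no-printLikelihoods` options are hypothetical, and only `-H`, `-L`, `--printHaplotypes` and `--printLikelihoods` are existing flags.

```java
// Hypothetical sketch only; HaROLD's real option handling may differ.
final class OutputFlags {
    final boolean printHaplotypes;
    final boolean printLikelihoods;

    private OutputFlags(boolean printHaplotypes, boolean printLikelihoods) {
        this.printHaplotypes = printHaplotypes;
        this.printLikelihoods = printLikelihoods;
    }

    // Scans the argument list; both outputs start enabled.
    static OutputFlags parse(String[] args) {
        boolean haplotypes = true;   // on by default
        boolean likelihoods = true;  // on by default
        for (String arg : args) {
            switch (arg) {
                case "-H":
                case "--printHaplotypes":
                    haplotypes = true;   // still accepted, so existing commands keep working
                    break;
                case "-L":
                case "--printLikelihoods":
                    likelihoods = true;
                    break;
                case "--no-printHaplotypes":   // hypothetical inverse option
                    haplotypes = false;
                    break;
                case "--no-printLikelihoods":  // hypothetical inverse option
                    likelihoods = false;
                    break;
                default:
                    break; // all other options are ignored in this sketch
            }
        }
        return new OutputFlags(haplotypes, likelihoods);
    }

    public static void main(String[] args) {
        OutputFlags flags = parse(args);
        System.out.println("printHaplotypes=" + flags.printHaplotypes
                + ", printLikelihoods=" + flags.printLikelihoods);
    }
}
```

With this arrangement, a command that omits both flags would still write the files needed by the refinement step, and commands that already pass `-H -L` would behave as before.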

I should note that I am using the V2.0 release.

cristina86cristina commented 4 months ago

Hi there, thanks for your comment! Can I check with you: did you run the following command?

java -jar /your-path-to-HaROLD/jar/Cluster_RG/dist/HaROLD-2.0.jar \
    --count-file sample.txt --haplotypes 4 --alpha-frac 0.5 --gamma-cache 10000 \
    -H -L --threads 4 -p /your-path-to-results/Step1_results

And this does not produce the output files for you? I will check the v2 release because the latest version (which you get when you clone the repo) does that automatically. Thanks for pointing this out and I apologise if that created confusion.

cristina86cristina commented 4 months ago

Hi there, I just checked and the v2 release is fine. As you can see in my comment above, our example included the options to print results with -H and -L (the short forms of --printHaplotypes and --printLikelihoods). I will note your suggestion and make that automatic in future versions of the tool!

zacherydickson commented 4 months ago

I can confirm that when the short or long versions of the --printHaplotypes and --printLikelihoods flags are included in the command, the output files are created. However, when the options are not included, the output files are not created. I tested with both the files in the V2.0 release and a clone of the current repo.

My suggestion is simply that output files should be created by default, without the need to include the command line options.

Regardless, I'm looking forward to working with the tool; it is very cool!