genome / bam-readcount

Count bases in BAM/CRAM files
MIT License

[JOSS REVIEW] Example usage #80

Closed (friedue closed 2 years ago)

friedue commented 3 years ago

Related to: https://github.com/openjournals/joss-reviews/issues/3722

Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems)?

As mentioned in the other issue #79, running the test data set was a bit of a pain and I didn't see it encouraged or documented. It'd be great if you could add that, either to the README or some other form of documentation that is linked to in the README.

The output file is a bit overwhelming, so maybe adding a brief example of how to parse it would be helpful. I found [this tutorial about calculating tumor VAFs](https://pmbio.org/module-05-somatic/0005/05/01/Somatic_LOH_Calling/) that seemed to do exactly that, so linking to it could be the minimalist solution. Adding a .Rmd/HTML to the bam-readcount repo might be even nicer to help people understand more quickly what the different scores represent.
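To make the request concrete, here is a minimal sketch of the kind of parsing example I have in mind. It assumes the output is tab-separated, with chromosome, position, reference base, and depth in the first four columns, followed by one colon-delimited entry per base whose first two subfields are the base and its read count. The function names and the sample line are hypothetical (the sample is abbreviated and made up, not real output), and dividing by the reported depth is just one simple way to get a VAF.

```python
def parse_readcount_line(line):
    """Parse one bam-readcount line into a dict (assumed layout, see above)."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, ref, depth = fields[0], int(fields[1]), fields[2], int(fields[3])

    counts = {}
    for entry in fields[4:]:
        # Only the first two subfields (base, count) are used here;
        # any remaining per-base metrics are ignored in this sketch.
        parts = entry.split(":")
        counts[parts[0]] = int(parts[1])

    return {"chrom": chrom, "pos": pos, "ref": ref, "depth": depth, "counts": counts}


def vaf(record, alt):
    """Fraction of reads supporting `alt`, relative to the reported depth."""
    if record["depth"] == 0:
        return 0.0
    return record["counts"].get(alt, 0) / record["depth"]


# Hypothetical, abbreviated line: real output carries more subfields per base.
SAMPLE = "chr1\t100\tA\t20\t=:0:0.00\tA:15:60.00\tC:5:60.00\tG:0:0.00\tT:0:0.00\tN:0:0.00"

record = parse_readcount_line(SAMPLE)
print(record["counts"]["A"], vaf(record, "C"))  # prints: 15 0.25
```

Something along these lines, plus a sentence per column on what the remaining per-base metrics mean, would already go a long way.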

Generally, I feel that it'd be useful to add a more extended blurb on typical use-cases for bam-readcount. I understand it's already part of more extensive pipelines, but since you're publishing it on its own merits here, it'd be great if new users could immediately grasp why they should start using bam-readcount (apart from performance arguments). I personally love visuals, but I understand that's not everyone's preference.

Last suggestion (totally optional): the paper on SARS-CoV-2 mutations that you cite sounds like it could provide a great example use case. If the authors would be willing to share their R and Python code, or to just write a brief walk-through of the part that uses bam-readcount and how it helped them shape their conclusions, that'd be fantastic.

bebatut commented 2 years ago

I totally agree with @friedue.

It would be also interesting to

chrisamiller commented 2 years ago

This was the biggest thing remaining, and @apldx addressed it with the merge he made yesterday, which adds an extensive (and relevant!) tutorial: https://github.com/genome/bam-readcount/tree/master/tutorial Marking this done.

friedue commented 2 years ago

This is a great and timely tutorial! It ticks all the boxes -- it showcases bam-readcount, it demonstrates what its place could be in typical analysis pipelines, and it features some hands-on, highly relevant and scientifically interesting results. Well worth the wait!

chrisamiller commented 2 years ago

Done!

friedue commented 2 years ago

You could really be capitalizing on the omicron-link to get people to use bam-readcount and familiarize themselves with it without noticing (because they'll be so eager to find out more about omicron). But that's your call and it's totally fine the way you've done it now. Shout-out to @apldx for a job really well done!

apldx commented 2 years ago

Thanks! It was really fun to work on. Seconding @chrisamiller, really appreciate all the suggestions.