merenlab / anvio

An analysis and visualization platform for 'omics data
http://merenlab.org/software/anvio
GNU General Public License v3.0

Reformat BAM files created prior to `anvi-script-reformat-fasta` #2165

telatin closed 8 months ago

telatin commented 8 months ago

Ciao pips! I recently joined the Anvi'O fanclub thanks to EBAME 8 :)

When a user has generated an assembly and mapped reads back against it, and then wants to import everything into Anvi'o, the contigs may carry names that Anvi'o does not accept.

anvi-script-reformat-fasta can:

This PR proposes to add a new anvi-script-reformat-bam that takes as input:

1) A BAM file generated against the initial assembly
2) The reformat-report produced by anvi-script-reformat-fasta

And produces:

1) A new BAM file with converted references
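To make the idea concrete, here is a minimal sketch of the renaming step, written against SAM text lines so it needs only the standard library. It is not the code in this PR. The function names `load_rename_map` and `rename_sam_line` are hypothetical, and the sketch assumes the reformat report is a TAB-delimited file with the new name in the first column and the original name in the second, and that the mapper kept only the first whitespace-delimited word of each original name.

```python
def load_rename_map(report_text):
    """Build {old_name: new_name} from a two-column TAB-delimited report.

    Assumption: column 1 is the new name, column 2 is the original name.
    Only the first word of the original name is used as the key, since
    read mappers usually truncate reference names at the first whitespace.
    """
    mapping = {}
    for line in report_text.splitlines():
        parts = line.split("\t")
        if len(parts) < 2 or not parts[1].strip():
            continue  # skip blank or malformed lines
        new_name, old_name = parts[0], parts[1].split()[0]
        mapping[old_name] = new_name
    return mapping


def rename_sam_line(line, mapping):
    """Return one SAM line with reference names converted via `mapping`."""
    fields = line.rstrip("\n").split("\t")
    if fields[0] == "@SQ":
        # Header: the reference name lives in the SN: tag.
        fields = [
            "SN:" + mapping.get(f[3:], f[3:]) if f.startswith("SN:") else f
            for f in fields
        ]
    elif not line.startswith("@"):
        # Alignment: RNAME is column 3 and RNEXT is column 7.
        # "=" and "*" are not in the mapping, so they pass through untouched.
        for i in (2, 6):
            if i < len(fields):
                fields[i] = mapping.get(fields[i], fields[i])
    return "\t".join(fields)


report = (
    "c_000000000001\tNODE_1_length_500_cov_3.2\n"
    "c_000000000002\tNODE_2_length_300_cov_1.1\n"
)
mapping = load_rename_map(report)
header = rename_sam_line("@SQ\tSN:NODE_1_length_500_cov_3.2\tLN:500", mapping)
aln = rename_sam_line(
    "r1\t99\tNODE_2_length_300_cov_1.1\t10\t60\t4M\t=\t120\t114\tACGT\tFFFF",
    mapping,
)
```

In a binary BAM file the alignment records refer to references by index, so a real implementation would mostly need to rewrite the header; the per-line handling above is only needed for the text form. Tags that embed reference names, such as `SA`, are not covered by this sketch.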

[For a small test dataset and documentation see telatin/anvi-script-reformat-bams]

📖 I tried drafting a Markdown documentation page and would appreciate help there, thanks!

meren commented 8 months ago

Dear @telatin, thank you very much for this PR! Everything looks great apart from a few minor issues I see with the docs (such as a %(contigs-fasta) instead of %(contigs-fasta)s) :) Please run the following command in your branch to make sure you don't see any errors (you can run it before fixing anything so hopefully this program can help you recognize the issue conveniently):
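For readers wondering why the trailing `s` matters, the snippet below is a small illustration. It assumes the help pages are filled in with Python's %-style named substitution, which the `%(...)s` syntax suggests but this thread does not confirm. The helper `try_render`, the templates, and the link target are made up for the example.

```python
def try_render(template, values):
    """Return the rendered text, or None if substitution fails."""
    try:
        return template % values
    except (ValueError, TypeError, KeyError):
        return None


# Hypothetical substitution table: placeholder name -> text to insert.
values = {"contigs-fasta": "[contigs-fasta](link)"}

# Correct: the placeholder ends with the `s` conversion type.
good = try_render("Reformat names in a %(contigs-fasta)s file.", values)

# Broken: without `s`, Python reads the characters that follow as the
# format specification, and the substitution fails.
bad = try_render("Reformat names in a %(contigs-fasta) file.", values)
```

Here `good` holds the sentence with the link filled in, while `bad` is `None` because the substitution raised an error.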

anvi-script-gen-help-pages -o TEST

(this is the exact command anvio.org will run to update the online docs :))

Best wishes,

telatin commented 8 months ago

Thanks for the tip, I got the anvi-script-gen-help-pages working.

The proposed artifact name is rename-report-file, but you will surely come up with something more consistent. Next week we can have a further look at the docs (with @ivagljiva too :) )

meren commented 8 months ago

Thank you very much, @telatin :) It is now in the main branch, and the docs at https://anvio.org/help/main/ will sync soon.

meren commented 8 months ago

Hey @telatin, I've made a few changes at #2168. I tried to explain my reasoning behind those changes in each commit (it is much more useful to look at them one by one than looking at the overall changes all at once).

I hope they're agreeable to you.

meren commented 8 months ago

Apologies for sending things one by one. There is also #2169, which contains a few relevant follow-up changes.

meren commented 8 months ago

This is already sync'd to https://anvio.org/ and appears at https://anvio.org/help/main/programs/anvi-script-reformat-bam/ :)

ivagljiva commented 8 months ago

Beautiful work 🥲