merenlab / anvio

An analysis and visualization platform for 'omics data
http://merenlab.org/software/anvio
GNU General Public License v3.0

Single-cell genome with contamination - Is merging necessary? #591

Closed Klcross closed 7 years ago

Klcross commented 7 years ago

Hello,

I am working with some single-cell genomes and trying to learn anvi'o from scratch. I have run into some difficulties in the process that maybe you could clarify?

1) I have mostly been following the "Tutorial for Metagenomic Work" to get started. After generating, sorting, and indexing my .bam files using BWA, I get a .bam.bai file, as the tutorial notes would be the case, since the sorted and indexed file is what you want. However, the anvi-profile step does not recognize the .bai file, only the .bam file. If I use the .bam file (post-sorting, pre-indexing), will this still work, or is the indexing necessary?

2) Since I only have one .bam file, can I skip the anvi-merge step? This was a little unclear when I tried to compare with the "Removing contaminants from cultivars" post. If I use --cluster-contigs during the anvi-profile step, do I then skip the merge (automatically skipping the hierarchical clustering in the process) and go straight to anvi-interactive?

3) Running anvi-profile with --cluster-contigs and skipping the merge gives me three files: AUXILIARY-DATA.h5, PROFILE.db, and RUNLOG.txt. I do not get a RUNINFO.cp; is this because I used the flag? I therefore end up running "anvi-interactive -p path/PROFILE.db -c path/contigs.db" instead, which gets me to the interactive interface. It looks like the attached screenshot and says "unable to connect to remote host", which I assume is still in part my fault in a step somewhere.

[Attached screenshot: 2017-09-19, 4:31:54 PM]

Thank you for the feedback as I power through this.

Karissa

meren commented 7 years ago

Hi Karissa,

The discussion group is more appropriate for these kinds of questions. GitHub issues are mostly for technical problems in the code.

> However for the anvi-profile step it does not want to recognize a .bai file but just the .bam file. If I use the .bam file post-sorting/pre-indexing will this still work or is the indexing necessary?

Please try and see.
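For anyone trying this, here is a minimal sketch of how the two files relate. It assumes samtools is used for sorting and indexing, and every file name is a placeholder. The .bam.bai index is a companion file that sits next to the sorted BAM; the BAM itself is what gets passed to anvi-profile with -i. The helper `bam_index_status` is hypothetical and is not part of anvi'o or samtools.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report whether a BAM file has its companion index.
# The index is expected at <file>.bai, i.e. sample.bam -> sample.bam.bai.
bam_index_status() {
    local bam="$1"
    if [ ! -f "$bam" ]; then
        echo "missing-bam"
    elif [ -f "${bam}.bai" ]; then
        echo "indexed"
    else
        echo "not-indexed"
    fi
}

# Typical usage (placeholder file names, commands not run here):
#   samtools sort raw.bam -o sample.bam
#   samtools index sample.bam          # writes sample.bam.bai next to it
#   bam_index_status sample.bam
#   anvi-profile -i sample.bam -c contigs.db -o SAMPLE_PROFILE
```

The index is never given to anvi-profile directly; it only needs to be present alongside the BAM file.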

> Since I only have one .bam file can I skip the anvi-merge step?

Yes, it is only relevant when you have multiple single profiles.

> It was a little unclear in trying to compare with the "Removing contaminants from cultivars" post but during the anvi-profile step if I use --cluster-contigs then I have to skip the merge (automatically skipping the hierarchical clustering in the process) and go straight to anvi-interactive?

Yes. You can then run anvi-interactive on your single profile.
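As a rough sketch of that single-sample path, the snippet below only prints the two commands rather than running them, so it works without anvi'o installed. The function name `single_profile_commands` and all paths are placeholders made up for illustration. The anvi-profile and anvi-interactive flags are the ones mentioned in this thread, plus -i and -o for the input BAM and the output directory.

```shell
#!/usr/bin/env bash
# Print (do not run) the commands for a single-sample workflow.
# Arguments: BAM file, contigs database, output directory (all placeholders).
single_profile_commands() {
    local bam="$1" contigs_db="$2" out_dir="$3"
    # Profile the one BAM file and cluster contigs during profiling,
    # because there will be no anvi-merge step to do the clustering.
    echo "anvi-profile -i ${bam} -c ${contigs_db} -o ${out_dir} --cluster-contigs"
    # Visualize the single profile directly.
    echo "anvi-interactive -p ${out_dir}/PROFILE.db -c ${contigs_db}"
}

single_profile_commands sample.bam contigs.db SAMPLE_PROFILE
```

With more than one sample, each BAM file would get its own profile and anvi-merge would combine them before visualization.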

> The result of my anvi-profile with --cluster-contigs skipping the merge is three files: AUXILIARY-DATA.h5, PROFILE.db, and RUNLOG.txt. I do not get a RUNINFO.cp but is this because I use the flag?

anvi'o no longer generates RUNINFO.cp. So it is normal that you don't have it.

> Therefore I end up using "anvi-interactive -p path/PROFILE.db -c path/contigs.db" instead which gets me to the interactive interface which looks like attached and says "unable to connect to remote host" which I am assuming is still in part my fault in a step somewhere.

You should either run anvi-interactive on your own computer after downloading all the necessary files (contigs.db, contigs.h5, profile.db, and auxiliary-data.h5), or try to follow this solution:

http://merenlab.org/2015/11/28/visualizing-from-a-server/
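One common way to view a server-side interactive session locally is SSH port forwarding. The sketch below illustrates that general idea and may differ in detail from the linked post. The helper `tunnel_command`, the host name, and port 8080 are assumptions for illustration, and the anvi-interactive options in the comments should be checked against `anvi-interactive --help` for your version.

```shell
#!/usr/bin/env bash
# Print (do not run) an SSH command that forwards a local port to the
# same port on the server. Host and port are placeholders.
tunnel_command() {
    local user_host="$1" port="${2:-8080}"
    echo "ssh -L ${port}:localhost:${port} ${user_host}"
}

# 1. On your own computer, open the tunnel with the printed command.
# 2. In that SSH session on the server, start the interface without
#    opening a browser there (flags assumed, verify with --help):
#      anvi-interactive -p PROFILE.db -c contigs.db --server-only -P 8080
# 3. On your own computer, open http://localhost:8080 in a browser.
tunnel_command user@server.example.org 8080
```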

Best,

Klcross commented 6 years ago

Wonderful, thank you. I will direct my discussions to the appropriate avenue next time. I'll give this another try!

Karissa
