marbl / metAMOS

A metagenomic and isolate assembly and analysis pipeline built with AMOS
http://marbl.github.io/metAMOS

I have some question about using metamos #234

Closed kds2923 closed 8 years ago

kds2923 commented 8 years ago

Hi there, I have some questions about metAMOS results and processing.

  1. What does '+sample1' mean in the proba1 N50 plot in the FindORFs section?
  2. According to the metAMOS paper, I can choose between ORFs and contigs in the annotation step. However, the result shows only one, and there is no command-line option for it. Is there a way to select ORFs or contigs?
  3. In the example from the metAMOS website, 'http://www.cbcb.umd.edu/~sergek/imetamos/gageb/Postprocess/out/html/summary.html', the Preprocess page includes a KmerGenie report. However, I could not find a command-line option to use KmerGenie. Is there any way to link the two results? I would also like to know whether the k-mers selected by KmerGenie affect the assembly and validation steps.
  4. When I run the classify step using BLAST, sometimes all results are 'unknown'. However, proba.classify.txt in the Postprocess folder shows good matches to species. Why is there a discrepancy between the two? I would also like to know whether I can control the classification level (e.g. genus, family) in the final summary.html.
  5. Can I use mate-pair reads as input? If I adjust the insert size as multi-input, does it automatically recognize the input?
  6. There is an error in the scaffold step. The error message is:

|2015-12-02 00:44:49| /mnt/lustre/tools/metAMOS-1.5rc3/AMOS/Linux-x86_64/bin/OrientContigs -minRedundancy 5 -all -redundancy 10 -b /mnt/lustre/home/kds2923/meta_shotgun/Reseq_test/reanalysis/Result2/Scaffold/in/proba.bnk -repeats /mnt/lustre/home/kds2923/meta_shotgun/Reseq_test/reanalysis/Result2/Scaffold/in/proba.reps

Last 10 lines of output (/mnt/lustre/home/kds2923/meta_shotgun/Reseq_test/reanalysis/Result2/Logs/SCAFFOLD.log)
FOR SKIPPED EDGE 201049 SET EDGE STATUS TO BE 6
FOR SKIPPED EDGE 212600 SET EDGE STATUS TO BE 6
FOR SKIPPED EDGE 212645 SET EDGE STATUS TO BE 5
FOR SKIPPED EDGE 215616 SET EDGE STATUS TO BE 5
FOR SKIPPED EDGE 220181 SET EDGE STATUS TO BE 6
FOR SKIPPED EDGE 223872 SET EDGE STATUS TO BE 6
FOR SKIPPED EDGE 227350 SET EDGE STATUS TO BE 6
FOR SKIPPED EDGE 229444 SET EDGE STATUS TO BE 6
FOR SKIPPED EDGE 229511 SET EDGE STATUS TO BE 5
FOR SKIPPED EDGE 232367 SET EDGE STATUS TO BE 6

Is there any way to solve this error except skipping the scaffold step?

skoren commented 8 years ago
  1. There is some support in the HTML scripts for doing comparative plots between multiple assemblies. The +sample1 just indicates you have only one sample in your plot.
  2. This depends on the tool used for annotation. Most use sequencing reads/contigs, but you could use phmmer or BLAST, which would use ORFs instead.
  3. KmerGenie is available and runs automatically (as long as you have R installed) if you do not specify a k-mer size to runPipeline and you're running in isolate mode (-W iMetAMOS). It is not available for metagenomic datasets.
  4. Unknown or top-level? The import script merges multiple classifications to get the LCA (lowest common ancestor), so even if you have good species-level hits for a contig, the LCA could be top-level. The HTML report is interactive and you can navigate to any taxonomic level you want.
  5. Yes, you just need to specify the -o flag if they are the Illumina-style "outtie" mates. I'm not sure what you mean by multi-input.
  6. I don't see an error in that output. It could be one of the open issues with scaffolding, in which case skipping is the only option.
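
To make point 4 a bit more concrete, here is a minimal sketch of how merging hits to their LCA can turn good species-level matches into a top-level call. This is not the metAMOS import script: the function name `lowest_common_ancestor` and the example lineages are assumptions made up for illustration, and the real merge logic may differ.

```python
from typing import List, Sequence


def lowest_common_ancestor(lineages: Sequence[Sequence[str]]) -> List[str]:
    """Return the longest root-to-leaf prefix shared by all lineages.

    Hypothetical helper for illustration only; it is not part of metAMOS.
    Each lineage is a list of taxon names ordered from the root downwards.
    """
    if not lineages:
        return []
    shared: List[str] = []
    # zip(*lineages) walks all lineages one rank at a time and stops at the
    # shortest one.
    for ranks in zip(*lineages):
        if all(name == ranks[0] for name in ranks):
            shared.append(ranks[0])
        else:
            break  # first disagreement: everything below is not shared
    return shared


# Two illustrative species-level hits for the same contig (assumed data).
hits = [
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria",
     "Enterobacterales", "Enterobacteriaceae", "Escherichia",
     "Escherichia coli"],
    ["Bacteria", "Firmicutes", "Bacilli", "Bacillales",
     "Staphylococcaceae", "Staphylococcus", "Staphylococcus aureus"],
]

print(lowest_common_ancestor(hits))  # ['Bacteria']
```

Each hit on its own resolves to a species, but the two lineages only agree at the top, so the merged classification for the contig ends up at the top level.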
skoren commented 8 years ago

Closed, inactivity.