rrwick / Unicycler

hybrid assembly pipeline for bacterial genomes
GNU General Public License v3.0

Odd results when assembling H. pylori #322

Open davidaray opened 1 year ago

davidaray commented 1 year ago

I use Unicycler and these data as part of a class I teach (with attribution, of course).

Here are the class pages for these exercises:

https://github.com/davidaray/Genomes-and-Genome-Evolution/wiki/05.-Simple-Bacterial-Genome-Assembly

https://github.com/davidaray/Genomes-and-Genome-Evolution/wiki/06.-Basic-Assembly-Statistics-and-Comparing-Assemblies

I have the students perform assemblies on H. pylori, S. pyogenes, and N. gonorrhoeae using various permutations of the data.

I test the software every summer before I teach the class and all has always gone well.

However, this time, I am getting odd results for H. pylori. Instead of a single scaffold when using the short reads and high depth long reads, I'm getting four contigs.

This hasn't happened to me before and I'm using the exact same command line as always.

unicycler \
  -1 ../data/helicobacterpylori/short_reads_1.fastq.gz \
  -2 ../data/helicobacterpylori/short_reads_2.fastq.gz \
  -l ../data/helicobacterpylori/long_reads_high_depth.fastq.gz \
  -o hp_long_high \
  --keep 3 \
  -t 10

The only thing I can think might be different is the installation itself. I always reinstall the software as well, just to make sure that the repositories haven't changed.

Any idea why the assembly is not completing as it should?

davidaray commented 1 year ago

Indeed, I just noticed that the last time I taught this class, I used 0.4.8. This time, the default installation was 0.5.0.

The log files indicate a difference in the dependencies.

Unicycler version: v0.5.0
Using 10 threads

Making output directory:
  /lustre/scratch/daray/gge2023_classwork/unicycler/hp_long_high

Dependencies:
  Program       Version   Status
  spades.py     3.15.5    good  
  racon         1.5.0     good  
  makeblastdb   2.5.0+    good  
  tblastn       2.5.0+    good  

Unicycler version: v0.4.8
Using 10 threads

Making output directory:
  /lustre/scratch/daray/classwork/unicycler/hp_long_high

Dependencies:
  Program         Version   Status  
  spades.py       3.14.1    good    
  racon           -         good    
  makeblastdb     2.5.0+    good    
  tblastn         2.5.0+    good    
  bowtie2-build   2.4.5     good    
  bowtie2         2.4.5     good    
  samtools        1.6       good    
  java            11.0.1    good    
  pilon           1.24      good    
  bcftools                  not used
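
For anyone who wants to compare the two dependency tables without reading them side by side, here is a minimal sketch in Python. The function names `parse_dependencies` and `diff_dependencies` are hypothetical helpers, not part of Unicycler, and the parsing assumes the column layout shown in the logs above (columns separated by two or more spaces, table ended by a blank line). It only reports what changed between the logs; it says nothing about which change is responsible for the different assembly.

```python
import re


def parse_dependencies(log_text):
    """Return {program: (version, status)} from a dependency table.

    Assumes the layout seen in the logs above: a header row starting with
    'Program', then one row per program with columns separated by 2+ spaces.
    """
    deps = {}
    in_table = False
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Program"):
            in_table = True
            continue
        if not in_table:
            continue
        if not stripped:
            break  # a blank line ends the table
        parts = re.split(r"\s{2,}", stripped)
        if len(parts) == 3:
            program, version, status = parts
        elif len(parts) == 2:
            # Rows such as 'bcftools   not used' have an empty version column.
            program, status = parts
            version = ""
        else:
            break  # not a table row
        deps[program] = (version, status)
    return deps


def diff_dependencies(old, new):
    """List human-readable differences between two parsed tables."""
    changes = []
    for program in sorted(set(old) | set(new)):
        if program not in new:
            changes.append(f"{program}: removed (was {old[program][0] or 'n/a'})")
        elif program not in old:
            changes.append(f"{program}: added ({new[program][0] or 'n/a'})")
        elif old[program][0] != new[program][0]:
            changes.append(f"{program}: {old[program][0]} -> {new[program][0]}")
    return changes


# Dependency tables copied from the two logs above.
OLD_LOG = """Dependencies:
  Program         Version   Status
  spades.py       3.14.1    good
  racon           -         good
  makeblastdb     2.5.0+    good
  tblastn         2.5.0+    good
  bowtie2-build   2.4.5     good
  bowtie2         2.4.5     good
  samtools        1.6       good
  java            11.0.1    good
  pilon           1.24      good
  bcftools                  not used
"""

NEW_LOG = """Dependencies:
  Program       Version   Status
  spades.py     3.15.5    good
  racon         1.5.0     good
  makeblastdb   2.5.0+    good
  tblastn       2.5.0+    good
"""

changes = diff_dependencies(parse_dependencies(OLD_LOG), parse_dependencies(NEW_LOG))
for change in changes:
    print(change)
```

Run on the tables above, this lists the SPAdes and Racon version changes and the programs (bowtie2, samtools, java, pilon, bcftools) that appear only in the v0.4.8 log.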