mikolmogorov / Ragout

Chromosome-level scaffolding using multiple references

Methods question with Ragout #81

Closed by ChuChuChaddy 1 year ago

ChuChuChaddy commented 1 year ago

Hello, it's me again! New genome, new problems.

We have a Chromium 10x assembly that went through Supernova, ARCS, and Tigmint. We are trying to annotate it, but the high fragmentation of the genome is getting in the way. Until more funds are available, I'm trying to improve it using existing data and software. The problem I'm running into is that there is no chromosome-level genome for my organism, but there is one for a closely related species. There are also two other genomes for my organism: one equivalent to ours (short reads, ALLPATHS) and one with improved scaffolding (PacBio). In my last issue, you told me that running Ragout with fewer genomes is ideal. My dilemma is which of the following would be the best option to run Ragout with. Our goal is to improve the assembly, annotate it, and use it to identify genes using RNA-seq.

- Chromosome-level assembly of a closely related species
- Less fragmented scaffold-level assembly of our organism (PacBio)
- Similarly fragmented scaffolds from another group (I do not think this is the best solution)

Or would I be better served using my current assembly + the closely related species + the PacBio assembly?

Our current assembly: length: 601,361,278 bp; N50: 8,764 bp; contigs: 180,908

Closely related species: length: 850,633,761 bp; N50: 48,536,009 bp; contigs: 201

Same organism, but PacBio: length: 914.9 Mb; N50: 7.3 kb; contigs: 180,908
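
For reference, figures like these can be recomputed directly from the FASTA files, which helps when comparing a Ragout run before and after. The sketch below is a minimal Python example, assuming N50 is defined in the usual way (the length of the shortest sequence in the smallest set of longest sequences that covers at least half of the total length). The function names `fasta_lengths` and `assembly_stats` are illustrative and not part of any tool mentioned in this thread.

```python
def fasta_lengths(path):
    """Return the length of every sequence in a plain-text FASTA file."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if current is not None:
                    lengths.append(current)
                current = 0
            elif current is not None:
                current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def assembly_stats(lengths):
    """Return (total_length, n50, sequence_count) for a list of lengths."""
    if not lengths:
        return 0, 0, 0
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return total, length, len(lengths)
```

Calling `assembly_stats(fasta_lengths("assembly.fasta"))` on each candidate (the file name here is a placeholder) gives the three numbers in the same order as listed above.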

Thanks for your time. I appreciate any insight you all can offer.

Cheers,

mikolmogorov commented 1 year ago

Hi,

I would try the chromosome-level assembly as a reference first. Then, you can try to add either of the extra ones and see if it makes an improvement.

Misha
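
One way to set up the stepwise comparison suggested above is to generate one recipe file per attempt: first with only the chromosome-level reference, then with the PacBio assembly added. The Python sketch below is only an illustration. It assumes the recipe keys (`.references`, `.target`, `<name>.fasta`, `<name>.draft`) as I understand them from Ragout's documented recipe format, so they should be checked against the documentation for the installed version. The genome names and file paths are hypothetical placeholders, and `build_recipe` is not part of Ragout.

```python
def build_recipe(target, references, fasta_paths, draft_refs=()):
    """Render recipe text for one run.

    target      -- name of the assembly to be scaffolded
    references  -- list of reference genome names, in the order to list them
    fasta_paths -- dict mapping every genome name to its FASTA path
    draft_refs  -- references that are themselves fragmented assemblies
    """
    lines = [
        ".references = " + ",".join(references),
        ".target = " + target,
        "",
    ]
    for name in list(references) + [target]:
        lines.append(f"{name}.fasta = {fasta_paths[name]}")
    # Assumption: fragmented references are flagged with "<name>.draft = true".
    for name in draft_refs:
        lines.append(f"{name}.draft = true")
    return "\n".join(lines) + "\n"


# Hypothetical names and paths, used only for illustration.
paths = {
    "related": "related_species.fasta",
    "pacbio": "pacbio_assembly.fasta",
    "target": "our_assembly.fasta",
}

# Attempt 1: chromosome-level reference only.
recipe_one = build_recipe("target", ["related"], paths)

# Attempt 2: add the PacBio assembly as a draft (fragmented) reference.
recipe_two = build_recipe("target", ["related", "pacbio"], paths,
                          draft_refs=["pacbio"])

print(recipe_one)
print(recipe_two)
```

Each recipe can be written to its own file and run separately, and the resulting scaffolds compared (for example by N50 and sequence count) to see whether the extra reference helps. The FASTA-based recipe form shown here is only one option; the Ragout documentation should be consulted for which synteny backend suits a genome of this size.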