ablab / spades

SPAdes Genome Assembler
http://ablab.github.io/spades/

Assembly of DNA and RNA amplicons from PacBio data #1070

Closed s-t-calus closed 1 year ago

s-t-calus commented 1 year ago

Is your feature request related to a problem? Please describe.

Dear Mr/Mrs,

I am trying to combine data from two PacBio runs: DNA (10 kb) and cDNA (10 kb). The regions partially overlap, and I would like to phase them by linking the short tandem repeats (STRs) and exonic SNP fingerprints in the cDNA to the exon+intron SNP fingerprints in the genomic DNA. The goal is to identify intronic SNPs and tandem-repeat counts that are separated by more than 150 kb. The cDNA was used to bring the STRs and exonic SNPs closer together.

Do you think SPAdes could assemble DNA reads (intronic and exonic) and RNA/cDNA reads (exonic) into a graph in which we could phase the STR counts and exonic SNPs (cDNA) with the intronic and exonic fingerprints (DNA) at 99-100% similarity?

If yes, which of the packages should I use? If not, do you have any idea what software I should consider using?

Thank you very much and please let me know if you have any questions. S-T-C

Describe the solution you'd like

Assemble the DNA and cDNA data at a 99-100% similarity threshold based on fingerprints, while preserving the SNP information so that distinct variants are not collapsed into false results.
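To make the concern about collapsing concrete, here is a minimal sketch of the arithmetic behind an identity threshold on 10 kb reads. It assumes identity is simply the fraction of matching positions between two equal-length, gap-free sequences. Real clustering tools compute identity over alignments and may define it differently, and sequencing errors are ignored here. The function names and the two toy alleles are hypothetical, for illustration only.

```python
from fractions import Fraction
from math import floor


def max_mismatches(read_length: int, min_identity: str) -> int:
    """Most mismatching positions two equal-length reads may have while
    still meeting the identity threshold (given as a decimal string)."""
    return floor(read_length * (1 - Fraction(min_identity)))


def identity(a: str, b: str) -> Fraction:
    """Fraction of matching positions between two equal-length, gap-free sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return Fraction(matches, len(a))


# Hypothetical 10 kb alleles that differ by a single SNP.
allele_a = "A" * 10_000
allele_b = "A" * 5_000 + "G" + "A" * 4_999

pair_identity = identity(allele_a, allele_b)  # 9999/10000

for threshold in ("0.99", "0.999", "1"):
    merged = pair_identity >= Fraction(threshold)
    print(
        f"threshold {threshold}: up to {max_mismatches(10_000, threshold)} "
        f"mismatches tolerated, single-SNP pair "
        f"{'merged' if merged else 'kept apart'}"
    )
```

Under these assumptions, a 99% threshold tolerates up to 100 mismatches per 10 kb, so two alleles that differ by one SNP would fall into the same cluster. Only a threshold of exactly 100% keeps them apart in this toy model, and at that setting any residual sequencing error would also split reads from the same allele.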

Describe alternatives you've considered

UCLUST or VSEARCH for OTU binning

Additional context

No response

asl commented 1 year ago

Hello

SPAdes is designed and intended to be used on Illumina-provided data. It cannot be used on PacBio data alone. You could try other software packages, e.g. miniasm, etc.

s-t-calus commented 1 year ago

Great, thank you. How about OTU binning with VSEARCH or UCLUST?

asl commented 1 year ago

How about OTU binning with VSEARCH or UCLUST?

No idea. You'd better ask in the relevant forums.