molikd / otb

Only The Best (Genome Assembly Tools)

Find a better small test dataset #39

Closed Astahlke closed 2 years ago

Astahlke commented 2 years ago

Something from the hifiasm git repo?

molikd commented 2 years ago

Semi-relatedly, testing is currently taking place on commit 38fda7671731d702b66777090516c5c8f030eca0

molikd commented 2 years ago

We should probably include Hi-C data as well.

https://www.ncbi.nlm.nih.gov/sra?LinkName=biosample_sra&from_uid=20203776 is small, but I can't find Hi-C data for it.

or maybe this one:

https://www.ncbi.nlm.nih.gov/assembly/GCA_002140095.1

PacBio reads belonging to this assembly came from SRA runs SRR5413213-SRR5413218. Illumina Hi-C reads were deposited in SRA accession SRR5413221.

The problem with this one is that it is a lot of raw data for a tiny genome.
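One way around the "lot of raw data" problem would be to subsample the reads down to a small test set. Below is a minimal sketch, assuming uncompressed 4-line FASTQ input; `read_fastq` and `subsample` are hypothetical helper names, not part of otb. It keeps each read with a fixed probability using a seeded random generator.

```python
import random
from typing import Iterable, Iterator, List, Tuple

# One FASTQ record: (header, sequence, separator, quality)
FastqRecord = Tuple[str, str, str, str]


def read_fastq(lines: Iterable[str]) -> Iterator[FastqRecord]:
    """Yield records from 4-line FASTQ text (no multi-line sequences)."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it).rstrip("\n")
        plus = next(it).rstrip("\n")
        qual = next(it).rstrip("\n")
        yield (header, seq, plus, qual)


def subsample(records: Iterable[FastqRecord], fraction: float,
              seed: int = 0) -> List[FastqRecord]:
    """Keep each record with probability `fraction`, reproducibly."""
    rng = random.Random(seed)
    # Exactly one rng draw per record, so two files with the same number
    # of records and the same seed keep the same record positions.
    return [rec for rec in records if rng.random() < fraction]
```

Because exactly one random draw is made per record, running `subsample` with the same seed on the R1 and R2 files of a paired Hi-C run should keep the mates in sync, provided both files list the reads in the same order.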

molikd commented 2 years ago

The more I think about this, let's just see if we have anything that is tiny and already published.

molikd commented 2 years ago

Scott was saying that the Indian meal moth (Pint) or Tribolium might be a good choice.

molikd commented 2 years ago

https://canu.readthedocs.io/en/latest/quick-start.html#quickstart

https://github.com/PacificBiosciences/DevNet/wiki/E.-coli-Bacterial-Assembly

molikd commented 2 years ago

I think we are going to need two E. coli datasets for hifiasm, and then something with both HiFi and Hi-C.

molikd commented 2 years ago

See /project/ag100pest/otb_test on Ceres.

molikd commented 2 years ago

See SRR5413221 for Hi-C, and SRR5413216 and SRR5413218 for HiFi.

Currently in /project/ag100pest/otb_test/otb
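For anyone who needs to re-fetch these runs elsewhere, here is a rough sketch that builds the sra-tools commands (`prefetch`, then `fasterq-dump`) for the accessions listed above. The `hic`/`hifi` labels follow this comment; the `otb_test_data` output directory and the `download_commands` helper are made up for illustration. It only prints the commands and does not run them.

```python
import shlex
from typing import Dict, List

# Accessions from this thread, grouped by the labels used in the comment.
RUNS: Dict[str, List[str]] = {
    "hic": ["SRR5413221"],
    "hifi": ["SRR5413216", "SRR5413218"],
}


def download_commands(runs: Dict[str, List[str]],
                      outdir: str = "otb_test_data") -> List[List[str]]:
    """Return sra-tools command lines (as argv lists) for each run."""
    cmds: List[List[str]] = []
    for kind, accessions in runs.items():
        for acc in accessions:
            # Download the run to the local sra-tools cache first.
            cmds.append(["prefetch", acc])
            # Convert to FASTQ; --split-files writes paired mates
            # (e.g. Hi-C) to separate files.
            cmds.append(["fasterq-dump", acc, "--split-files",
                         "--outdir", f"{outdir}/{kind}"])
    return cmds


if __name__ == "__main__":
    for cmd in download_commands(RUNS):
        print(shlex.join(cmd))
```

Printing rather than executing keeps this safe to paste into a job script and review before launching it on a cluster.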