Open dportik opened 1 year ago
On the "raw" side [^1] there are both GRCh38.p14 and T2T-CHM13v2.0 signatures in wort, would that work?
[^1]: just downloaded the data and calculated a signature, no other pre-processing like repeat masking
Yep! Those should be plenty.
Repo to sketch hg38, including all unmapped chromosomes: https://github.com/ctb/2024-human-sketch
Note: see also the issue on decontaminating human WGS samples, https://github.com/sourmash-bio/sourmash/issues/3151
download at: https://farm.cse.ucdavis.edu/~ctbrown/sourmash-db/hg38/hg38-entire.sig.zip
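For anyone wondering what screening against a human sketch boils down to, here is a toy illustration in plain Python. It is only a sketch of the containment idea, not sourmash's API: sourmash works on subsampled k-mer hashes (FracMinHash) rather than full k-mer sets, and the sequences, the `kmers`/`containment` helper names, and `k=5` below are all made up for illustration.

```python
def kmers(seq, k):
    """Return the set of all k-length substrings of seq (uppercased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def containment(query, reference, k=5):
    """Fraction of the query's distinct k-mers that also occur in the reference."""
    q = kmers(query, k)
    if not q:
        # Query shorter than k: nothing to compare.
        return 0.0
    return len(q & kmers(reference, k)) / len(q)


# Made-up stand-in for a host (e.g. human) reference sequence.
reference = "ACGTACGTTAGCCGATAGGCTA"
# A "read" cut directly out of the reference.
host_read = reference[3:15]
# A made-up read that shares no 5-mers with the reference.
other_read = "TTTTTGGGGGCCCCCAAAAA"

if __name__ == "__main__":
    print("host read containment: ", containment(host_read, reference))
    print("other read containment:", containment(other_read, reference))
```

A high containment of a sample in the human sketch suggests host content that should be accounted for before interpreting microbial matches. In practice the k-mer size and any threshold would need to match how the signature was built.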
Hi Titus et al., given the recent fiasco related to mapping reads to microbial databases without human references (links at bottom), it might be a good time to create a small human genome database for use with sourmash. A standalone database on the database page would be ideal, so that researchers can include it alongside the other databases of interest.
Thanks for considering!
social media discussion: https://twitter.com/StevenSalzberg1/status/1686350449069244416
pre-print: https://doi.org/10.1101/2023.07.28.550993