Closed by hoelzer 2 months ago
Hi, is it possible to use the recent T2T reference genome instead of GRCh38? Many thanks.
Hi, thanks for the hint. Indeed, it would be nice to provide this. Do you have a link where one could download the recent T2T human genome FASTA?
Besides, if you want to use it right away, you can download the genome yourself and provide it to CLEAN as an --own input:
https://github.com/rki-mf1/clean/blob/711532eb50c059ac199bf386c84be9f409781fc5/clean.nf#L266
I think the main NCBI page for that genome is here: https://www.ncbi.nlm.nih.gov/datasets/genome/GCF_009914755.1/
They even provide a URL to download it, using curl, but I haven't tested it out yet:
https://api.ncbi.nlm.nih.gov/datasets/v2alpha/genome/accession/GCF_009914755.1/download?include_annotation_type=GENOME_FASTA&include_annotation_type=GENOME_GFF&include_annotation_type=RNA_FASTA&include_annotation_type=CDS_FASTA&include_annotation_type=PROT_FASTA&include_annotation_type=SEQUENCE_REPORT&hydrated=FULLY_HYDRATED
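In case it helps, here is a minimal Python sketch of the same download, equally untested against the live API. It rebuilds the URL above but requests only the genome FASTA, since that is all that is needed for the --own input. The function names, the output file name, and the assumption that the response is a zip archive are mine, not from NCBI or CLEAN.

```python
import shutil
import urllib.request
from urllib.parse import urlencode

# Endpoint prefix taken from the URL posted above (v2alpha may change).
NCBI_DOWNLOAD_BASE = "https://api.ncbi.nlm.nih.gov/datasets/v2alpha/genome/accession"


def build_download_url(accession, annotation_types=("GENOME_FASTA",)):
    """Build the NCBI Datasets download URL for one assembly accession."""
    # The parameter name is repeated once per requested file type,
    # exactly as in the URL from the NCBI page.
    query = [("include_annotation_type", t) for t in annotation_types]
    query.append(("hydrated", "FULLY_HYDRATED"))
    return f"{NCBI_DOWNLOAD_BASE}/{accession}/download?{urlencode(query)}"


def download(url, dest):
    """Stream the response to disk (hypothetical helper, assumed to be a zip)."""
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)


if __name__ == "__main__":
    url = build_download_url("GCF_009914755.1")
    print(url)
    # Uncomment to actually fetch the archive (large download):
    # download(url, "t2t_chm13.zip")
```

After unpacking the archive, the genome FASTA inside it should be usable as the --own input mentioned above.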
It is now added in the latest release.
OK, it is already the default :)