Bioconductor / BSgenome

Software infrastructure for efficient representation of full genomes and their SNPs
https://bioconductor.org/packages/BSgenome

New CHM13 genome #22

Closed by gevro 3 years ago

gevro commented 3 years ago

Hi, is it possible to create BSgenome packages for the new telomere-to-telomere (T2T) human genomes? https://github.com/marbl/CHM13

gevro commented 3 years ago

Note that I cannot forge it myself, because I get this error:

Error in .make_Seqinfo_from_genome(genome) : 
  "CHM13v1" is not a registered NCBI assembly or UCSC genome (use registered_NCBI_assemblies() or
  registered_UCSC_genomes() to list the NCBI or UCSC assemblies/genomes currently registered in the GenomeInfoDb
  package)

Here is my seed file:

Package: BSgenome.Hsapiens.T2T.CHM13v1
Title: T2T CHM13 v1.0
Description: Full genome sequence for T2T CHM13 v1.0
Version: 1.0
organism: Homo sapiens
common_name: H. Sapiens
provider: T2T
genome: CHM13v1
release_date: 2021/11
source_url: https://github.com/marbl/CHM13
organism_biocview: Homo_sapiens
BSgenomeObjname: Hsapiens
seqnames: c("chr1","chr2","chr3","chr4","chr5","chr6","chr7","chr8","chr9","chr10","chr11","chr12","chr13","chr14","chr15","chr16","chr17","chr18","chr19","chr20","chr21","chr22","chrX","chrM")
seqs_srcdir: /Users/giladevrony/Downloads/chm13/seqs_srcdir

vjcitn commented 3 years ago

Thanks for this report. There will need to be a bit of work in GenomeInfoDb to get this functionality set up. Please be patient. I hope we can turn this around quickly.

hpages commented 3 years ago

Is this an NCBI or UCSC genome? Only NCBI or UCSC genomes can be "registered" in GenomeInfoDb. Having the genome registered means that GenomeInfoDb "knows" the names of the sequences and their circularity flags, so in that case you don't need to specify the seqnames and circ_seqs fields in your seed file. When the genome is not registered, you need to specify both fields.
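
For illustration, the relevant part of a seed file for an unregistered genome might look roughly like the sketch below. The genome and seqnames values are copied from the seed file posted above. The circ_seqs line is an assumption: it treats chrM (the mitochondrial sequence) as the only circular sequence in this assembly, which should be checked against the assembly itself. Both fields are R expressions that evaluate to character vectors.

```
genome: CHM13v1
seqnames: c("chr1","chr2","chr3","chr4","chr5","chr6","chr7","chr8","chr9","chr10","chr11","chr12","chr13","chr14","chr15","chr16","chr17","chr18","chr19","chr20","chr21","chr22","chrX","chrM")
circ_seqs: "chrM"
```

If an assembly has no circular sequences at all, circ_seqs would be set to character(0) rather than left out.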

gevro commented 3 years ago

OK. I tried that, but now I get a different error: see issue #23.