warelab / gramene-ensembl

Gramene's Ensembl plugins, extensions, configuration.
MIT License

Research on sweet cherry genome for future release #1

Open weix-cshl opened 5 years ago

weix-cshl commented 5 years ago

A user suggested that Gramene include the sweet cherry genome. Investigate whether there is a sweet cherry genome that meets the EPl-Gramene inclusion criteria, i.e., published, genome deposited in INSDC, gene annotation available.
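The three criteria above can be written down as a small checklist. This is only a sketch for keeping track of candidates: the `CandidateGenome` class, its field names, and the example values are hypothetical and are not part of any Gramene or Ensembl tooling.

```python
from dataclasses import dataclass


@dataclass
class CandidateGenome:
    """Hypothetical record for one candidate assembly."""
    name: str
    published: bool            # a paper describing the genome exists
    in_insdc: bool             # assembly deposited in ENA/GenBank/DDBJ
    has_gene_annotation: bool  # gene models are publicly available


def unmet_criteria(genome: CandidateGenome) -> list[str]:
    """Return the labels of the inclusion criteria the genome fails."""
    checks = {
        "published": genome.published,
        "deposited in INSDC": genome.in_insdc,
        "gene annotation available": genome.has_gene_annotation,
    }
    return [label for label, ok in checks.items() if not ok]


def meets_inclusion_criteria(genome: CandidateGenome) -> bool:
    """A genome qualifies only when every criterion is met."""
    return not unmet_criteria(genome)


# Made-up example: deposited and published, but no annotation yet.
example = CandidateGenome("example_assembly", published=True,
                          in_insdc=True, has_gene_annotation=False)
print(meets_inclusion_criteria(example), unmet_criteria(example))
```

Returning the list of failed criteria, rather than a bare yes/no, makes it easy to note in the issue what is still missing for a given assembly.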

weix-cshl commented 5 years ago

ENA has two sweet cherry (Prunus avium) genomes:

  1. Genome assembled by a Chinese group: 1,500 scaffolds. MinION platform of Oxford Nanopore technology.
  2. Genome assembled by a Japanese group: 10,000 scaffolds. Illumina whole-genome shotgun sequencing technology.

However, the Chinese genome does not seem to have any released gene annotation, while the Japanese genome comes with a published paper and a lot of data, including gene annotation and variation data.

The Japanese assembly, PAV_r1.0: https://www.ebi.ac.uk/ena/data/view/BDGV01000000

Genome-Assembly-Data-START

Assembly Method :: SOAPdenovo v. 2-rev240; GapCloser v. 1.10
Assembly Name :: PAV_r1.0
Genome Coverage :: 327x
Sequencing Technology :: Illumina HiSeq 2000

Genome-Assembly-Data-END
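For comparing several ENA records, the `Key :: Value` block above can be pulled into a dictionary. The following is a minimal sketch, assuming the comment has been saved as plain text with one field per line; the function name `parse_structured_comment` is made up for illustration and the sample text is copied from the record above.

```python
def parse_structured_comment(text: str, tag: str = "Genome-Assembly-Data") -> dict[str, str]:
    """Collect 'Key :: Value' lines found between <tag>-START and <tag>-END."""
    start, end = f"{tag}-START", f"{tag}-END"
    fields: dict[str, str] = {}
    inside = False
    for raw in text.splitlines():
        # Flat files often wrap the markers in '##'; drop those if present.
        line = raw.strip().strip("#").strip()
        if line == start:
            inside = True
        elif line == end:
            break
        elif inside and "::" in line:
            key, value = line.split("::", 1)
            fields[key.strip()] = value.strip()
    return fields


SAMPLE = """\
Genome-Assembly-Data-START
Assembly Method :: SOAPdenovo v. 2-rev240; GapCloser v. 1.10
Assembly Name :: PAV_r1.0
Genome Coverage :: 327x
Sequencing Technology :: Illumina HiSeq 2000
Genome-Assembly-Data-END
"""

info = parse_structured_comment(SAMPLE)
print(info["Assembly Name"], info["Sequencing Technology"])
```

Splitting on the first `::` only keeps values that themselves contain punctuation, such as the semicolon in the assembly method, intact.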

The annotation can be downloaded from:

https://www.rosaceae.org/species/prunus_avium/genome_v1.0.a1

ftp://ftp.bioinfo.wsu.edu/species/Prunus_avium/Prunus_avium-genome.v1.0.a1/genes/

https://www.rosaceae.org/analysis/235

NCBI gene annotation for the Japanese assembly PAV_r1.0: https://www.ncbi.nlm.nih.gov/genome/annotation_euk/Prunus_avium/100/

The Japanese annotation was released with the assembly in the paper: http://europepmc.org/abstract/MED/28541388

Abstract from the paper:

We determined the genome sequence of sweet cherry (Prunus avium) using next-generation sequencing technology. The total length of the assembled sequences was 272.4 Mb, consisting of 10,148 scaffold sequences with an N50 length of 219.6 kb. The sequences covered 77.8% of the 352.9 Mb sweet cherry genome, as estimated by k-mer analysis, and included >96.0% of the core eukaryotic genes. We predicted 43,349 complete and partial protein-encoding genes. A high-density consensus map with 2,382 loci was constructed using double-digest restriction site-associated DNA sequencing. Comparing the genetic maps of sweet cherry and peach revealed high synteny between the two genomes; thus the scaffolds were integrated into pseudomolecules using map- and synteny-based strategies. Whole-genome resequencing of six modern cultivars found 1,016,866 SNPs and 162,402 insertions/deletions, out of which 0.7% were deleterious. The sequence variants, as well as simple sequence repeats, can be used as DNA markers. The genomic information helps us to identify agronomically important genes and will accelerate genetic studies and breeding programs for sweet cherries. Further information on the genomic sequences and DNA markers is available in DBcherry (http://cherry.kazusa.or.jp (8 May 2017, date last accessed)).