Open weix-cshl opened 5 years ago
I asked Bruno about adopting this process at CSHL. Here is his answer:
"unfortunately the current version of the genome loader talks directly to ENA databases and for that reason that repo is private.
The moment that is replaced to calls to REST end points you will be free to use it of course, "
Bruno is using the following scripts to load the sweet cherry genome and gene annotations. These scripts are more automated and compliant with EBI metadata. To stay consistent with Epl, Gramene may want to adopt the same process. The scripts he used are:
Genome Loader: https://github.com/Ensembl/plant_tools/blob/master/core/load_genome_shell.pl This script imports the genome assembly and metadata directly from the ENA database and loads them into an Ensembl core database. I assume it also takes care of the global variables in ensembl_production.
GFF loader: https://github.com/Ensembl/plant_tools/blob/master/core/load_GFF_hive.pl He had to fix the GFF file (contig names, pseudogenes, RecName descriptions, no mRNA features) before using this script. The script starts an eHive pipeline to do the loading; running it on an execution node rather than a login node was recommended.
Run the core analyses: He then annotated repeats, xrefs, and metadata, and ran the healthchecks.
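For the GFF clean-up mentioned under the GFF loader, the exact fixes Bruno applied are not described here. As a rough illustration of one of them (contig names), here is a minimal Python sketch that rewrites the sequence IDs in a GFF3 file so they match the names loaded into the core database. The function name `rename_seqids`, the `name_map` argument, and the example contig names are all hypothetical, and this is not part of plant_tools.

```python
def rename_seqids(gff_lines, name_map):
    """Yield GFF3 lines with the seqid (column 1) rewritten via name_map.

    name_map is a hypothetical dict such as {"ctg1": "scaffold_1"}.
    IDs missing from the map are left unchanged. Embedded ##FASTA
    sequence headers are not handled in this sketch.
    """
    for line in gff_lines:
        line = line.rstrip("\n")
        if line.startswith("##sequence-region"):
            # Directive format: ##sequence-region seqid start end
            parts = line.split()
            if len(parts) >= 2:
                parts[1] = name_map.get(parts[1], parts[1])
            yield " ".join(parts)
        elif line.startswith("#") or not line.strip():
            # Other directives, comments and blank lines pass through.
            yield line
        else:
            cols = line.split("\t")
            cols[0] = name_map.get(cols[0], cols[0])
            yield "\t".join(cols)


if __name__ == "__main__":
    # Assumed example input; real files would be read from disk.
    example = [
        "##gff-version 3\n",
        "##sequence-region ctg1 1 1000\n",
        "ctg1\tsrc\tgene\t1\t100\t.\t+\t.\tID=g1\n",
    ]
    for fixed in rename_seqids(example, {"ctg1": "scaffold_1"}):
        print(fixed)
```

The other fixes (pseudogenes, RecName descriptions, missing mRNA features) depend on how the source annotation was produced, so they would need their own handling.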