Sabryr / Databases-on-SAGA

Setup databases centrally
MIT License

Suggestions reference genomes #3

Open MonicaSolbakken opened 3 years ago

MonicaSolbakken commented 3 years ago

May I suggest the following genomes, which are often used as outgroups or as supplemental information in various biological analyses:

The genomes are available at either GenomeArk or ensembl.org.

Model species:
- Drosophila melanogaster
- Mouse, Mus musculus
- Human, Homo sapiens
- Rat, Rattus norvegicus
- S. cerevisiae, Saccharomyces cerevisiae
- Zebrafish, Danio rerio
- Chicken, Gallus gallus
- Tropical clawed frog, Xenopus tropicalis
- C. elegans, Caenorhabditis elegans

Evolutionary outgroups:
- Coelacanth, Latimeria chalumnae
- Elephant shark, Callorhinchus milii
- Hagfish, Eptatretus burgeri
- Spotted gar, Lepisosteus oculatus
- Northern pike, Esox lucius
- Lamprey, Petromyzon marinus
- Atlantic cod, Gadus morhua

Possible representatives for each of the major vertebrate lineages:

| Lineage | Species name | Common name | Order |
|---|---|---|---|
| Amphibian | Bufo bufo | common toad | Anura |
| Amphibian | Geotrypetes seraphini | Gaboon caecilian | Gymnophiona |
| Birds | Aquila chrysaetos chrysaetos | European golden eagle | Accipitriformes |
| Birds | Sterna hirundo | common tern | Charadriiformes |
| Cartilaginous | Scyliorhinus canicula | lesser-spotted catshark | Carcharhiniformes |
| Cartilaginous | Pristis pectinata | smalltooth sawfish | Pristiformes |
| Cartilaginous | Amblyraja radiata | Thorny skate | Rajiformes |
| Fish | Pygocentrus nattereri | Red-bellied piranha | Characiformes |
| Fish | Antennarius maculatus | warty frogfish | Lophiiformes |
| Mammal | Pipistrellus pipistrellus | common pipistrelle | Chiroptera |
| Mammal | Tachyglossus aculeatus | short-beaked echidna | Monotremata |
| Mammal | Sciurus carolinensis | grey squirrel | Rodentia |
| Reptile | Thamnophis elegans | Western terrestrial garter snake | Squamata |
| Reptile | Gopherus evgoodei | Goode's thornscrub tortoise | Testudines |

All the best, Monica Solbakken

Sabryr commented 3 years ago

Thank you for the information. This project is not just about mirroring the genomes but about setting them up in a consistent way (with indices etc.). We shall include those, but I need some more details. I understand this may be a lot of work, so please provide as much information as you can, even if not all of it (you may ask your colleagues to contribute to this issue as well).

If you have access to Saga, can you please go to the following location (where we have started including references):

/cluster/shared/databases/references/

Then check whether the genome you suggest is present. If it is, please check that we have set up what you need.

If not, please check the example for Human: /cluster/shared/databases/references/Homo_sapiens

For example, /cluster/shared/databases/references/Homo_sapiens/Ensembl is what we have from Ensembl for human.

Then check the format and suggest what to include. For example, for "Chicken Gallus gallus", find out how the genome is versioned and where the data can be downloaded, and post that information here.
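
The presence check described above can be sketched in Python as follows. This is only an illustration, not a script that exists in this repository: the helper names `species_dir_name` and `list_sources` are hypothetical, and the code assumes that every species directory follows the `Genus_species` naming seen in the `Homo_sapiens` example, with one subdirectory per data source (such as `Ensembl`).

```python
from pathlib import Path
from typing import List, Optional

# Root quoted in the comment above; it only exists on Saga.
REFERENCES_ROOT = Path("/cluster/shared/databases/references")


def species_dir_name(scientific_name: str) -> str:
    """Turn 'Gallus gallus' into 'Gallus_gallus' (assumed naming convention)."""
    return "_".join(scientific_name.split())


def list_sources(scientific_name: str,
                 root: Path = REFERENCES_ROOT) -> Optional[List[str]]:
    """Return the sorted subdirectory names for a species, or None if absent."""
    species_dir = root / species_dir_name(scientific_name)
    if not species_dir.is_dir():
        return None
    return sorted(p.name for p in species_dir.iterdir() if p.is_dir())


if __name__ == "__main__":
    for name in ("Homo sapiens", "Gallus gallus"):
        sources = list_sources(name)
        if sources is None:
            print(f"{name}: not set up yet")
        else:
            print(f"{name}: present, sources = {sources}")
```

Run on a Saga login node, this reports for each listed species whether a directory is already present and which sources it contains, which is a reasonable starting point before suggesting what else to include.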