beanumber / tidy-databases

materials for ASA webinar on using databases in the tidyverse

examples from ensembl #4

Closed nicholasjhorton closed 6 years ago

nicholasjhorton commented 6 years ago

mysql --host=ensembldb.ensembl.org --port=3306 --user=anonymous -p

No password is needed: press Enter at the prompt.

https://www.ensembl.org/info/data/mysql.html
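For scripted use, the same connection can be made without the interactive password prompt by leaving out `-p`, since the anonymous account has no password. The sketch below is only an illustration: the helper name `ensembl_query` and the `MYSQL_BIN` override are hypothetical and not part of the original comment; the host, port, and user are the ones given above.

```shell
# Hypothetical helper: run one SQL statement against the public Ensembl server.
# MYSQL_BIN is an assumed override (e.g. MYSQL_BIN=echo for a dry run);
# it defaults to the standard mysql client.
ensembl_query() {
  "${MYSQL_BIN:-mysql}" \
    --host=ensembldb.ensembl.org \
    --port=3306 \
    --user=anonymous \
    --execute="$1"
}
```

A call such as `ensembl_query "show databases like 'sus_scrofa%'"` would then list the matching databases, assuming the server is reachable.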

beanumber commented 6 years ago

OK, I was able to connect, but I have no idea what these data are. Do you have a use case?

nicholasjhorton commented 6 years ago

No, but I'm working on it.

use sus_scrofa_variation_79_102;
show tables;
select * from publication limit 10;
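The statements in this comment can also be run in one non-interactive batch. This is a rough sketch, assuming the last statement is meant to end in `limit 10` (as in the output posted later in this thread). The function name `preview_publications` and the `MYSQL_BIN` override are hypothetical; the database name and the column names come from the thread.

```shell
# Hypothetical helper: list the tables and preview the publication table.
# The database is passed as a positional argument, so no "use" statement is needed.
# --table asks the client for boxed output even though input comes from stdin.
preview_publications() {
  "${MYSQL_BIN:-mysql}" \
    --host=ensembldb.ensembl.org \
    --port=3306 \
    --user=anonymous \
    --table \
    sus_scrofa_variation_79_102 <<'SQL'
SHOW TABLES;
SELECT publication_id, title, pmid, year FROM publication LIMIT 10;
SQL
}
```

Selecting a few named columns instead of `*` keeps the boxed output narrow enough to read in a terminal.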

https://www.ncbi.nlm.nih.gov/pubmed/25662601

East Balkan Swine (EBS) Sus scrofa is the only aboriginal domesticated pig breed in Bulgaria and is distributed on the western coast of the Black Sea in Bulgaria. To reveal the breed's genetic characteristics, we analysed mitochondrial DNA (mtDNA) and Y chromosomal DNA sequences of EBS in Bulgaria. Nucleotide diversity (πn) of the mtDNA control region, including two newly found haplotypes, in 54 EBS was higher (0.014 ± 0.007) compared with that of European (0.005 ± 0.003) and Asian (0.006 ± 0.003) domestic pigs and wild boar. The median-joining network based on the mtDNA control region showed that the EBS and wild boar in Bulgaria comprised mainly two major mtDNA clades, European clade E1 (61.3%) and Asian clade A (38.7%). The coexistence of two mtDNA clades in EBS in Bulgaria may be the relict of historical pig translocation. Among the Bulgarian EBS colonies, the geographical differences in distribution of two mtDNA clades (E1 and A) could be attributed to the source pig populations and/or historical crossbreeding with imported pigs. In addition, analysis of the Y chromosomal DNA sequences for the EBS revealed that all of the EBS had haplotype HY1, which is dominant in European domestic pigs.

nicholasjhorton commented 6 years ago

mysql> show tables;
+---------------------------------------+
| Tables_in_sus_scrofa_variation_79_102 |
+---------------------------------------+
| allele                                |
| allele_code                           |
| associate_study                       |
| attrib                                |
| attrib_set                            |
| attrib_type                           |
| compressed_genotype_region            |
| compressed_genotype_var               |
| coord_system                          |
| display_group                         |
| failed_allele                         |
| failed_description                    |
| failed_structural_variation           |
| failed_variation                      |
| genotype_code                         |
| individual                            |
| individual_genotype_multiple_bp       |
| individual_population                 |
| individual_synonym                    |
| individual_type                       |
| meta                                  |
| meta_coord                            |
| motif_feature_variation               |
| phenotype                             |
| phenotype_feature                     |
| phenotype_feature_attrib              |
| population                            |
| population_genotype                   |
| population_structure                  |
| population_synonym                    |
| protein_function_predictions          |
| protein_function_predictions_attrib   |
| publication                           |
| read_coverage                         |
| regulatory_feature_variation          |
| seq_region                            |
| source                                |
| strain_gtype_poly                     |
| structural_variation                  |
| structural_variation_association      |
| structural_variation_feature          |
| structural_variation_sample           |
| study                                 |
| submitter_handle                      |
| subsnp_handle                         |
| subsnp_map                            |
| tagged_variation_feature              |
| transcript_variation                  |
| translation_md5                       |
| variation                             |
| variation_attrib                      |
| variation_citation                    |
| variation_feature                     |
| variation_genename                    |
| variation_hgvs                        |
| variation_set                         |
| variation_set_structural_variation    |
| variation_set_structure               |
| variation_set_variation               |
| variation_synonym                     |
+---------------------------------------+
60 rows in set (0.10 sec)

nicholasjhorton commented 6 years ago

mysql> select * from publication limit 10;
| publication_id | title | authors | pmid | pmcid | year | doi | ucsc_id |
| 1 | A combination of two variants in PRKAG3 is needed for a positive effect on meat quality in pigs. | Uimari P, Sironen A. | 24580963 | PMC3943410 | 2014 | 10.1186/1471-2156-15-29 | NULL |
| 2 | RNA deep sequencing reveals novel candidate genes and polymorphisms in boar testis and liver tissues with divergent androstenone levels. | Gunawan A, Sahadevan S, Neuhoff C, Große-Brinkhaus C, Gad A, Frieden L, Tesfaye D, Tholen E, Looft C, Uddin MJ, Schellander K, Cinar MU. | 23696805 | PMC3655983 | 2013 | 10.1371/journal.pone.0063259 | NULL |
| 3 | TLR4 single nucleotide polymorphisms (SNPs) associated with Salmonella shedding in pigs. | Kich JD, Uthe JJ, Benavides MV, Cantão ME, Zanella R, Tuggle CK, Bearson SM. | 24566961 | PMC3990860 | 2014 | 10.1007/s13353-014-0199-8 | NULL |
| 4 | Population history and genomic signatures for high-altitude adaptation in Tibetan pigs. | Ai H, Yang B, Li J, Xie X, Chen H, Ren J. | 25270331 | PMC4197311 | 2014 | 10.1186/1471-2164-15-834 | NULL |

https://www.ncbi.nlm.nih.gov/pubmed/24580963

BMC Genet. 2014 Feb 28;15:29. doi: 10.1186/1471-2156-15-29.

A combination of two variants in PRKAG3 is needed for a positive effect on meat quality in pigs.

Uimari P, Sironen A.

Abstract

BACKGROUND: Color and pH of meat measured 24 h post mortem are common selection objectives in pig breeding programs. Several amino acid substitutions in PRKAG3 have been associated with various meat quality traits. In our previous study ASGA0070625, a SNP next to PRKAG3, had the most significant association with meat quality traits in the Finnish Yorkshire. However, the known amino acid substitutions, including I199V, did not show any association. The aims of this study were to characterize further variation in PRKAG3 and its promoter region, and to test the association between these variants and the pH and color of pork meat.

RESULTS: The data comprised of 220 Finnish Landrace and 230 Finnish Yorkshire artificial insemination boars with progeny information. We sequenced the coding and promoter region of PRKAG3 in these and in three additional wild boars. Genotypes from our previous genome-wide scans were also included in the data. Association between SNPs or haplotypes and meat quality traits (deregressed estimates of breeding values from Finnish national breeding value estimation for pH, color lightness and redness measured from loin or ham) was tested using a linear regression model. Sequencing revealed several novel amino acid substitutions in PRKAG3, including K24E, I41V, K131R, and P134L. Linkage disequilibrium was strong among the novel variants, SNPs in the promoter region and ASGA0070625, especially for the Yorkshire. The strongest associations were observed between ASGA0070625 and the SNPs in the promoter region and pH measured from loin in the Yorkshire and between I199V and pH measured from ham in the Landrace. In contrast, ASGA0070625 was not significantly associated with meat quality traits in the Landrace and I199V not in the Yorkshire. Haplotype analysis showed a significant association between a haplotype consisting of 199I and 24E alleles (or g.-157C or g.-58A alleles in the promoter region) and pH measured from loin and ham in both breeds (P-values varied from 1.72 × 10⁻⁴ to 1.80 × 10⁻⁸).

CONCLUSIONS: We conclude that haplotype g.-157C - g.-58A - 24E - 199I in PRKAG3 has a positive effect on meat quality in pigs. Our results are readily applicable for marker-assisted selection in pigs.

PMID: 24580963 PMCID: PMC3943410 DOI: 10.1186/1471-2156-15-29