monarch-initiative / dipper

Data Ingestion Pipeline for Monarch
https://dipper.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Mapping of ZFIN terms and relationships #93

Closed bryanlaraway closed 9 years ago

bryanlaraway commented 9 years ago

References #12.

I'm hunting around for some relationship terms for this resource, particularly between genes and various markers/features in the gene_marker_relationship.txt file:

- clone contains gene
- coding sequence of
- contains polymorphism
- gene contains small segment
- gene encodes small segment
- gene has artifact
- gene hybridized by small segment
- gene produces transcript
- gene product recognized by antibody
- knockdown reagent targets gene - targets_instance_of (GENO:0000414)
- promoter of
- transcript targets gene - targets_instance_of (GENO:0000414) (Doesn't seem quite right, as GENO:0000414 seems specific to gene targeting reagents.)

I also need a few for the feature_affected_gene.txt file:

- markers missing - corresponds to SO:1000029 (chromosomal deletion), with the relationship being described as 'this feature (chromosomal deletion) X this gene.' Would a simple 'deletes' work here?
- markers moved - corresponds to SO:1000199 (translocation), with the relationship being similar to 'this feature (translocation) X this gene.' Translocates?

Last (for now) is the relationship between mutagens and mutagees in the features.txt file. How should these relationships be modeled?

- Mutagens: CRISPR, EMS, ENU, DNA, g-rays, not specified, spontaneous, TALEN, TMP, zinc finger nuclease.
- Mutagees: adult females, adult males, embryos, not specified, sperm.

nlwashington commented 9 years ago

(quoted from above)

I'm hunting around for some relationship terms...

I don't think we need any of those relationships in our transformation (right now).

markers missing

You would be creating a sequence alteration that is a deletion (given the SO ID that they provide). That sequence alteration is_sequence_variant_of the gene, so no new relationship is required. I imagine it is the same for the entire table, and the same applies to the moved gene. Longer term, however, we'll have to model translocations properly (as in, we'll need to add the location information, but that might actually come from another file).
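
A minimal sketch of that modeling, to make the idea concrete. It assumes each row of feature_affected_gene.txt yields a feature ID, a gene ID, and the ZFIN label ('markers missing' or 'markers moved'). The function name, the lookup table, and the plain-tuple triples are all hypothetical and are not dipper's actual API; the predicate is kept as a label because no ontology ID for it is given in this thread.

```python
# Hypothetical sketch: names here are illustrative, not dipper's actual API.

# ZFIN labels from feature_affected_gene.txt and the SO IDs quoted in this issue.
SO_TYPE_BY_ZFIN_LABEL = {
    "markers missing": "SO:1000029",  # chromosomal deletion
    "markers moved": "SO:1000199",    # translocation
}

# Kept as a plain label; the ontology ID is left to the real mapping.
IS_SEQUENCE_VARIANT_OF = "is_sequence_variant_of"


def feature_gene_triples(feature_id, gene_id, zfin_label):
    """Return (subject, predicate, object) triples for one row.

    The feature is typed as a sequence alteration using the SO ID, and the
    same relationship to the gene is used regardless of the ZFIN label.
    """
    triples = []
    so_type = SO_TYPE_BY_ZFIN_LABEL.get(zfin_label)
    if so_type is not None:
        # Type the sequence alteration only when the label is one we know.
        triples.append((feature_id, "rdf:type", so_type))
    triples.append((feature_id, IS_SEQUENCE_VARIANT_OF, gene_id))
    return triples


# Placeholder IDs, not real ZFIN identifiers.
for triple in feature_gene_triples("ZFIN:feature1", "ZFIN:gene1", "markers missing"):
    print(triple)
```

The point of the sketch is that the ZFIN label only changes the type assigned to the feature, so no 'deletes' or 'translocates' relationship is introduced. Location information for translocations is not covered here.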

relationship between mutagens and mutagees

We don't really need this right now.