WormBase / pseudoace

Modelling the WormBase ACeDB database in datomic.

Cyclic INXREF causes data loss in migration #46

Closed mgrbyte closed 8 years ago

mgrbyte commented 8 years ago

This was discovered by @azurebrd; the original email report/question is added below.

Add a test for cyclic IN/OUT XREFs.

I'm trying to query for  gene -> rnai -> phenotype/phenotype-not-observed
and for             gene -> variation -> phenotype/phenotype-not-observed

I've got this working for RNAi

[:find ?rnai ?phen ?pname
 :in $
 :where
 [?gid :gene/id "WBGene00003883"]
 [?xref :rnai.gene/gene ?gid]
 [?rid :rnai/gene ?xref]
 [?rid :rnai/id ?rnai]
 [?rid :rnai/phenotype-not-observed ?pid]
 [?pid :rnai.phenotype-not-observed/phenotype ?pobj]
 [?pobj :phenotype/id ?phen]
 [?pobj :phenotype/primary-name ?ppnid]
 [?ppnid :phenotype.primary-name/text ?pname]]

Results :
...
["WBRNAi00019306" "WBPhenotype:0000059" "larval arrest"]
["WBRNAi00019306" "WBPhenotype:0001037" "sterile progeny"]
["WBRNAi00002251" "WBPhenotype:0000886" "Variant"]
["WBRNAi00002251" "WBPhenotype:0000050" "embryonic lethal"]

But I can't figure out where the Variation -> Phenotype /
Phenotype_not_observed is

In the models.wrm I see
https://github.com/WormBase/pseudoace/blob/master/models/models.wrm.annot
?Variation Evidence #Evidence
           Description Phenotype ?Phenotype INXREF Variation #Phenotype_info
                       Phenotype_remark ?Text #Evidence
                       Phenotype_not_observed ?Phenotype INXREF Not_in_Variation #Phenotype_info

But in the schemas I don't see the connections
https://github.com/WormBase/pseudoace/tree/develop/generated-schemas

(schema
 phenotype
 (fields
  [id :string :unique-identity]
  [description :ref :component]
  [primary-name :ref :component]
  [synonym :ref :many :component]
  [short-name :ref :many :component]
  [assay :ref :component]
  [remark :ref :many :component]
  [specialisation-of :ref :many]
  [dead :boolean]
  [go-term :ref :many :component]
  [do-term :ref :many :component]))

(schema
 variation
 (fields
  [id :string :unique-identity]
...
  [phenotype-remark :ref :many :component]
  [remark :ref :many :component]))

Did they get put somewhere else? Are you able to see the
Variation-Phenotype connections somewhere in datomic?

Thanks,
Juancarlos

Paul-Davis commented 8 years ago

Thanks @mgrbyte, I will look into systematically testing for this inconsistency. I'm currently going through a list :)
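
A systematic check of this kind could be sketched roughly as follows. This is a minimal sketch, not pseudoace code, and it rests on an assumption the thread does not spell out: that "cyclic" means both ends of an XREF pair are annotated INXREF, so neither side ends up as a stored attribute (which would match the missing fields in the generated schemas quoted in the report). The map shape, the function name `both-sides-inxref`, and the `example-xrefs` data are all hypothetical; the reverse-side entries in particular are illustrative and not copied from models.wrm.annot.

```clojure
;; Hypothetical representation of annotated XREF tags: one map per tag,
;; naming the model and tag it lives on, the model and tag it points back
;; to, and whether the annotation is INXREF (:in) or OUTXREF (:out).
(def example-xrefs
  [{:model "Variation" :tag "Phenotype"
    :target-model "Phenotype" :target-tag "Variation" :direction :in}
   ;; Illustrative reverse side, assumed (not verified) to also be INXREF.
   {:model "Phenotype" :tag "Variation"
    :target-model "Variation" :target-tag "Phenotype" :direction :in}
   ;; Illustrative healthy pair: one side :out, the other :in.
   {:model "RNAi" :tag "Phenotype"
    :target-model "Phenotype" :target-tag "RNAi" :direction :out}
   {:model "Phenotype" :tag "RNAi"
    :target-model "RNAi" :target-tag "Phenotype" :direction :in}])

(defn both-sides-inxref
  "Returns the set of XREF pairs in which both ends are marked :in.
  Each pair is a set of [model tag] vectors."
  [xrefs]
  (let [;; Every [model tag] end that is annotated as incoming.
        in-ends (set (for [{:keys [model tag direction]} xrefs
                           :when (= direction :in)]
                       [model tag]))]
    (set (for [{:keys [model tag target-model target-tag direction]} xrefs
               :when (and (= direction :in)
                          (contains? in-ends [target-model target-tag]))]
           #{[model tag] [target-model target-tag]}))))

;; Lists every pair that has no outgoing (stored) side.
(both-sides-inxref example-xrefs)
```

An empty result would mean every pair has at least one non-INXREF end; a test could assert exactly that over the tags parsed from the annotated models, however that parsing is actually done.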