Open bwalsh opened 8 years ago
Very useful Brian.
Some comments:
Mark
Brian notifications@github.com writes:
Team:
The schemas have received feedback that the semantics of SearchGenotypePhenotypeRequest are very unclear. In this section of the api documentation that applies to our schema, I've added some examples and guidance.
We are proposing to deprecate the un-scoped string used in the query and replace it with a scoped TermQuery. We hope to introduce this in our current pull request, together with a placeholder for external identifiers in Evidence and PhenotypeInstance. Your comments are invaluable. readme
In addition, we have also been asked to consider a PhenotypeAssociation which has a wider scope; it connects evidence to entities other than Feature. Here we propose a new entrypoint that follows the modified pattern of the G2P and adds phenotype/search. This allows for discovery of evidence associated with (Variant, FeatureEvent, BioSample, Individual, CallSet). Again, your comments will be useful. readme
If I understand this correctly, I think we should be concerned about clashing of unscoped identifiers. For example, I read this as supporting something like { 'phenotype': ['FH'] }, in which I think it's unclear whether that's FH the gene (via "ExternalIdentifierQuery") or Familial Hypercholesterolemia (via "PhenotypeQuery"). Is that (or something like it) a valid concern here?
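The clash described above can be sketched in a few lines of Python. This is only an illustration: the `INDEX` table and the `lookup` helper are hypothetical and are not part of the schema or of any server implementation. It shows how the un-scoped identifier "FH" matches two unrelated records, while a scoped lookup matches one.

```python
# Hypothetical in-memory index keyed by (scope, identifier).
# Neither the table nor the scope names come from the schema.
INDEX = {
    ("feature", "FH"): "FH (fumarate hydratase gene)",
    ("phenotype", "FH"): "Familial Hypercholesterolemia",
}


def lookup(identifier, scope=None):
    """Return sorted labels matching identifier; scope=None means un-scoped."""
    return sorted(
        label
        for (entry_scope, entry_id), label in INDEX.items()
        if entry_id == identifier and scope in (None, entry_scope)
    )


unscoped = lookup("FH")                   # ambiguous: matches both records
scoped = lookup("FH", scope="phenotype")  # unambiguous: matches one record
```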
https://github.com/ga4gh/schemas/pull/432#issuecomment-189512499
The semantics of SearchGenotypePhenotypeRequest are very unclear. I would really have no idea how to construct a query.
https://github.com/ga4gh/schemas/pull/432#discussion_r54935254
record EvidenceQuery {
  /**
  only those fields from evidence that are `queryable`
  */
  union { null, OntologyTerm } evidenceType;
  union { null, string } description = null; /* regex */
  union { null, array<org.ga4gh.models.ExternalIdentifier> } externalIdentifiers = null; /* new field */
}
record FeatureQuery {
  /**
  only those fields from feature that are `queryable`
  */
  union { null, string } name; /* new field, regex */
  union { null, string } description; /* new field, regex */
  union { null, string } featureSetId;
  union { null, string } referenceName;
  union { null, long } start = 0;
  union { null, long } end;
  union { null, Strand } strand;
  union { null, OntologyTerm } type; /* new field */
  union { null, OntologyTerm } featureType;
  union { null, array<org.ga4gh.models.ExternalIdentifier> } externalIdentifiers = null; /* new field */
}
record PhenotypeQuery {
  /**
  only those fields from phenotype that are `queryable`
  */
  union { null, OntologyTerm } type;
  union { null, array<OntologyTerm> } qualifier = null;
  union { null, OntologyTerm } ageOfOnset = null;
  union { null, string } description = null; /* regex */
  union { null, array<org.ga4gh.models.ExternalIdentifier> } externalIdentifiers = null; /* new field */
}
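As a rough illustration of how a client might fill in a PhenotypeQuery, here is a minimal Python sketch that builds the JSON body. The field names follow the record above; the `ontology_term` helper and its `id`/`term` keys are assumptions, since the OntologyTerm record is not shown in this thread.

```python
import json


def ontology_term(term_id, label=None):
    """Assumed OntologyTerm shape (id plus label); the real record may differ."""
    return {"id": term_id, "term": label}


def phenotype_query(type_=None, qualifier=None, age_of_onset=None,
                    description=None, external_identifiers=None):
    """Build a PhenotypeQuery body; unset fields stay null, as in the record."""
    return {
        "type": type_,
        "qualifier": qualifier,
        "ageOfOnset": age_of_onset,
        "description": description,
        "externalIdentifiers": external_identifiers,
    }


query = phenotype_query(
    type_=ontology_term("http://www.ebi.ac.uk/efo/EFO_0003767"),
    age_of_onset=ontology_term("HP:0003581", "adult onset"),
)
payload = json.dumps(query, sort_keys=True)
```

Leaving unused fields as null mirrors the `union { null, ... }` declarations, so a server could treat null as "no constraint on this field".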
regex: The regular expression language is defined in XQuery 1.0 and XPath 2.0 Functions and Operators section 7.6.1 Regular Expression Syntax.
One criticism of the current API is that it is overloaded: it violates the design goal of separation of concerns. Specifically, it combines the search for evidence with the search for features and the search for genotypes.
This proposal moves search, alias matching, and external identifier lookup to dedicated endpoints.
POST phenotypes/search PhenotypeQuery
POST features/search FeatureQuery
The SearchGenotypePhenotype search is simplified. Features and Phenotypes are expressed as a simple array of string identifiers. Evidence can be queried via the new EvidenceQuery.
record SearchGenotypePhenotypeRequest {
  ...
  union { null, array<string> } featureIds = null;
  union { null, array<string> } phenotypeIds = null;
  union { null, array<EvidenceQuery> } evidence = null;
  ...
}
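A minimal Python sketch of a simplified request body, assuming a JSON encoding of the record above. Only the three fields shown are populated; the elided fields are left out because they are not given here. The `search_g2p_request` helper and the description pattern are hypothetical.

```python
import json


def search_g2p_request(feature_ids=None, phenotype_ids=None, evidence=None):
    """Build only the three fields shown in the record; elided fields are omitted."""
    return {
        "featureIds": feature_ids,
        "phenotypeIds": phenotype_ids,
        "evidence": evidence,
    }


request = search_g2p_request(
    feature_ids=["f12345"],
    evidence=[{
        "evidenceType": None,
        "description": ".*sensitivity.*",  # hypothetical regex, for illustration
        "externalIdentifiers": None,
    }],
)
body = json.dumps(request)
```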
G2P servers are implemented in three different contexts:
Flexible representation of Feature
Show associations for this feature:
POST feature/[id]/associations FeatureAssociationQuery
(diagram: Feature, PhenotypeAssociationSet)
Show associations for this phenotype:
POST phenotype/[id]/associations PhenotypeAssociationQuery
(diagram: EntityName, PhenotypeAssociationSet)
Consider instead a PhenotypeAssociation which has a wider scope; the objects it connects and the evidence type determine the meaning of the association.
POST [EntityName]/[id]/associations [EntityName]AssociationQuery
(diagram: EntityName, PhenotypeAssociationSet)
Id Searches: Feature Lookup
| Q: I have a featureId ("f12345").
| Create a SearchGenotypePhenotypeRequest
| {… "featureIds" : ["f12345"] … }
| The system will respond with evidence for features that match on that identifier.
| Q: I only want somatic variant features (SO:0001777). How do I limit results?
| Create a FeatureQuery, specify featureType
| POST to feature/search
| The system will respond with features that match on that type.
| The client would then use those feature.id values to construct a SearchGenotypePhenotypeRequest.
| Q: I have a SNP id ("rs6920220").
| Create a FeatureQuery.ids
| POST to feature/search
| The system will respond with features that match on external identifier.
| The client would then use those feature.id values to construct a SearchGenotypePhenotypeRequest.
| Dependency: external_ids to be added to Feature.ids
| Q: I have an identifier for BRCA1 (GO:0070531). How do I query for the feature?
| Create a FeatureQuery.type
| POST to feature/search
| The system will respond with features that match on ontology term.
| The client would then use those feature.id values to construct a SearchGenotypePhenotypeRequest.
| Dependency: ontologies to be added to Feature.type
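The two-step lookup described in these answers (search for features first, then query associations by feature id) could look roughly like the Python sketch below. Everything about the transport is an assumption: `fake_post` stands in for an HTTP client, the `genotypephenotype/search` path and the `features`/`associations` response keys are hypothetical, and the sample data is invented. The thread spells the feature endpoint both `feature/search` and `features/search`; the sketch uses the latter.

```python
# Invented sample data, for illustration only.
FEATURES = [
    {"id": "f12345", "featureType": "SO:0001777"},
    {"id": "f67890", "featureType": "SO:0001778"},
]
ASSOCIATIONS = [
    {"id": "a1", "featureId": "f12345"},
    {"id": "a2", "featureId": "f67890"},
]


def fake_post(path, body):
    """Stand-in for an HTTP POST; paths and response keys are assumptions."""
    if path == "features/search":
        wanted = body["featureType"]["id"]
        return {"features": [f for f in FEATURES if f["featureType"] == wanted]}
    if path == "genotypephenotype/search":
        ids = set(body["featureIds"])
        return {"associations": [a for a in ASSOCIATIONS if a["featureId"] in ids]}
    raise ValueError("unknown path: " + path)


def associations_for(post, feature_query):
    """Step 1: resolve feature ids. Step 2: search associations by those ids."""
    features = post("features/search", feature_query).get("features", [])
    feature_ids = [f["id"] for f in features]
    if not feature_ids:
        return []
    response = post("genotypephenotype/search", {"featureIds": feature_ids})
    return response.get("associations", [])


result = associations_for(fake_post, {"featureType": {"id": "SO:0001777"}})
```

Passing the transport in as `post` keeps the flow testable without a running server; a real client would supply a function that issues the HTTP request.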
Id Searches: Phenotype Lookup
| Q: I have a phenotype id ("p12345").
| Create a SearchGenotypePhenotypeRequest
| {..., "phenotypeIds": ["p12345"],...}
| The system will respond with evidence that matches on PhenotypeInstance.id.
| Q: I have a Disease ontology id ("http://www.ebi.ac.uk/efo/EFO_0003767").
| POST PhenotypeQuery.type to phenotype/search
| The system will respond with phenotypes that match on OntologyTerm.id.
| The client would then use those phenotype.id values to construct a SearchGenotypePhenotypeRequest.
| Q: I have an ontology term for a phenotype (HP:0001507, 'Growth abnormality'). How do I query it?
| POST PhenotypeQuery.qualifier to phenotype/search
| The system will respond with phenotypes that match on OntologyTerm.id.
| The client would then use those phenotype.id values to construct a SearchGenotypePhenotypeRequest.
| Q: I am only interested in phenotypes qualified with (PATO_0001899, decreased circumference).
| POST PhenotypeQuery.qualifier to phenotype/search
| The system will respond with phenotypes whose qualifiers match that ontology term via 'is_a'.
| The client would then use those phenotype.id values to construct a SearchGenotypePhenotypeRequest.
| Q: I am only interested in phenotypes with ageOfOnset of (HP:0003581, adult onset).
| POST PhenotypeQuery.ageOfOnset to phenotype/search
| The system will respond with phenotypes whose ageOfOnset matches.
| The client would then use those phenotype.id values to construct a SearchGenotypePhenotypeRequest.
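The 'is_a' matching mentioned for qualifiers is not specified further in this thread. One possible reading, sketched below in Python, is that a phenotype matches when any of its qualifiers equals the queried term or descends from it. The `PARENTS` table, the `EX:` term ids, and the phenotype records are all invented for illustration and do not reflect the real PATO hierarchy.

```python
# Hypothetical child -> parents table; not the real PATO hierarchy.
PARENTS = {
    "EX:child": ["PATO_0001899"],
    "EX:grandchild": ["EX:child"],
    "EX:other": [],
}

# Invented phenotype records carrying qualifier term ids.
PHENOTYPES = [
    {"id": "p12345", "qualifiers": ["EX:grandchild"]},
    {"id": "p2", "qualifiers": ["EX:other"]},
    {"id": "p3", "qualifiers": ["PATO_0001899"]},
]


def is_a(term_id, ancestor_id, parents=PARENTS):
    """True if term_id equals ancestor_id or reaches it by following parents."""
    seen, stack = set(), [term_id]
    while stack:
        current = stack.pop()
        if current == ancestor_id:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False


def match_by_qualifier(phenotypes, qualifier_id):
    """Return ids of phenotypes with at least one qualifier that is_a qualifier_id."""
    return [
        p["id"]
        for p in phenotypes
        if any(is_a(q, qualifier_id) for q in p["qualifiers"])
    ]


matched = match_by_qualifier(PHENOTYPES, "PATO_0001899")
```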
@bwalsh - I like the proposed simplification of the SearchGenotypePhenotypeRequest to accept phenotype ids and feature ids.
The SeqAnn schema already has a features/search, so a new endpoint like FeatureQuery is not required. Adding ExternalIdentifiers to Feature and a Feature GET by ExternalIdentifier endpoint makes sense. This requirement has already been raised: https://github.com/ga4gh/schemas/issues/578.