legumeinfo / mine-issues

Report ALL issues on LIS mines here! Regardless of which mine you found it on!

Request New Template: Retrieve Gene Model positions by Gene Model Name #143

Closed jd-campbell closed 8 months ago

jd-campbell commented 8 months ago

I would like to request a new template to replace a tool on SoyBase.

The new template can be called "Gene --> Location Start / Stop". It will retrieve the sequence coordinates for a list of soybean gene calls and return a table containing the chromosome and the start and end positions (in bp on the chromosome) for each gene call.

I messed around with the Query Builder, and here is a screenshot of what I came up with:

[Screenshot: Screen Shot 2023-12-28 at 13 08 00]

adf-ncgr commented 8 months ago

Thanks @jd-campbell; I guess @sammyjava will take care of this, but just to be clear: I think you are not intending this to work only for gnm4.ann1 genes, right? In the template context, the constraints on annotation and assembly versions can be made optional, which might be the best choice if you want to allow people to restrict the outputs but keep the template flexible enough to support mixed sets of input genes. FWIW, here is the template XML that accomplishes that:

<template name="Genes_Locations" title="Genes -&gt; Locations" comment="" dataTypes="java.lang.String java.lang.String java.lang.Integer java.lang.Integer java.lang.String">
  <query name="Genes_Locations" model="genomic" view="Gene.primaryIdentifier Gene.chromosomeLocation.locatedOn.primaryIdentifier Gene.chromosomeLocation.start Gene.chromosomeLocation.end Gene.chromosomeLocation.strand" longDescription="" sortOrder="Gene.primaryIdentifier asc" constraintLogic="A and B and C">
    <constraint path="Gene" code="A" editable="true" op="IN" value="Gene list for all organisms 28 Dec 2023 14.11"/>
    <constraint path="Gene.annotationVersion" code="B" editable="true" description="" switchable="off" op="=" value="ann1"/>
    <constraint path="Gene.assemblyVersion" code="C" editable="true" description="" switchable="off" op="=" value="gnm1"/>
  </query>
</template>
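
For anyone reading the template XML above outside the webapp, here is a minimal Python sketch (not part of the mine itself) that parses it with the standard library and reports which constraints are optional. It assumes the InterMine convention that a `switchable` value of "on" or "off" marks a constraint as optional, with "off" meaning it starts disabled; the function name `summarize_constraints` is made up for illustration.

```python
import xml.etree.ElementTree as ET

# The template XML from this comment, copied verbatim.
TEMPLATE_XML = """<template name="Genes_Locations" title="Genes -&gt; Locations" comment="" dataTypes="java.lang.String java.lang.String java.lang.Integer java.lang.Integer java.lang.String">
  <query name="Genes_Locations" model="genomic" view="Gene.primaryIdentifier Gene.chromosomeLocation.locatedOn.primaryIdentifier Gene.chromosomeLocation.start Gene.chromosomeLocation.end Gene.chromosomeLocation.strand" longDescription="" sortOrder="Gene.primaryIdentifier asc" constraintLogic="A and B and C">
    <constraint path="Gene" code="A" editable="true" op="IN" value="Gene list for all organisms 28 Dec 2023 14.11"/>
    <constraint path="Gene.annotationVersion" code="B" editable="true" description="" switchable="off" op="=" value="ann1"/>
    <constraint path="Gene.assemblyVersion" code="C" editable="true" description="" switchable="off" op="=" value="gnm1"/>
  </query>
</template>"""


def summarize_constraints(template_xml):
    """Return one dict per <constraint>, flagging optional/disabled ones.

    Hypothetical helper: assumes switchable="on"/"off" means optional,
    and switchable="off" means the constraint is disabled by default.
    """
    root = ET.fromstring(template_xml)
    rows = []
    for c in root.find("query").findall("constraint"):
        switchable = c.get("switchable")  # None when the attribute is absent
        rows.append({
            "code": c.get("code"),
            "path": c.get("path"),
            "op": c.get("op"),
            "value": c.get("value"),
            "optional": switchable in ("on", "off"),
            "enabled": switchable != "off",
        })
    return rows


if __name__ == "__main__":
    for row in summarize_constraints(TEMPLATE_XML):
        print(row)
```

Under that assumption, constraint A (the gene list) is always applied, while B and C are optional and switched off until the user turns them on.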

sammyjava commented 8 months ago

This is one of those funky InterMine things that make sense once you wrap your head around it. The webapp UI makes no distinction between selecting a gene by name and selecting a list of genes: lists and singletons are treated equally in the path query framework front end. So, I've created the following template: https://mines.legumeinfo.org/glycinemine/template.do?name=gene_list_locations&scope=all but it doesn't really make sense as it stands, since you can enter a gene identifier, and if you don't have any lists you won't see the lists below the field, where the default is "select_a_gene_list_below".

So I'd say put a proper gene identifier in the default, and when folks have a list containing the root class (Gene) it'll always show as an option.

In other words, the template is really Gene --> Locations. Having a list of genes is on the user.

I also didn't put in annotation or assembly version filters, since anyone who has a list has usually applied a filter when creating it. BUT, if you really want to filter genes FROM a list by annotation/assembly, I can add those filters. I think they add confusion.

sammyjava commented 8 months ago

I can add instructions: "Enter a gene identifier, or select a list of genes that you have already stored. Default: Glyma.01G005400."

adf-ncgr commented 8 months ago

agreed that it is weird to have the default value be "select_a_gene_list_below"; otherwise looks good to me.

sammyjava commented 8 months ago

I made this update. And now I see why you wanted an annotation selector: when you enter a name instead of selecting a list, multiple annotations can have the same gene name. So I threw an assembly version filter in there. We are (I am) thinking about NOT supporting multiple annotations of the same assembly in the mines, so I'll leave the annotation filter off for now. (I don't think the confusion produced by having multiple genes from the same assembly with the same name, but different locations, is warranted.) Closing, but reopen if you want me to change something.

https://mines.legumeinfo.org/glycinemine/template.do?name=gene_locations&scope=global
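
For scripted use, a template like this can usually also be run through InterMine's web service rather than the webapp. The sketch below only builds the request URL and does not contact the mine. It rests on several assumptions that are not confirmed in this thread: that GlycineMine exposes the standard `/service/template/results` endpoint with `name`, `constraint1`, `op1`, `value1` and `format` parameters, and that the template's editable constraint is on `Gene` with a `LOOKUP` operator. The helper name `template_results_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumed service root for GlycineMine (derived from the webapp URL above).
BASE = "https://mines.legumeinfo.org/glycinemine/service"


def template_results_url(template_name, value, path="Gene", op="LOOKUP", fmt="tab"):
    """Build a template-results URL for a single editable constraint.

    Hypothetical helper: the constraint path and operator are guesses and
    should be checked against the template's actual definition.
    """
    params = {
        "name": template_name,
        "constraint1": path,
        "op1": op,
        "value1": value,
        "format": fmt,
    }
    return f"{BASE}/template/results?{urlencode(params)}"


if __name__ == "__main__":
    # Uses the default gene identifier suggested earlier in this thread.
    print(template_results_url("gene_locations", "Glyma.01G005400"))
```

If the real template adds the assembly version filter as a second constraint, it would presumably need matching `constraint2`/`op2`/`value2` parameters; that is also an assumption to verify against the mine.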