vloux / ProteoRE

GNU General Public License v3.0

new component : Retrieve Expression Info from HPA #71

Closed combesf closed 6 years ago

combesf commented 6 years ago

@NguyenLien These are the specifications for a new component that retrieves data from HPA without any input required.

The component has to work as follows:

The user first selects the tissue(s) from a drop-down menu (58 tissues in the data table), then the level among 4 choices (checkboxes): "High", "Medium", "Low" or "Not detected", then the reliability among 4 choices (checkboxes): "Approved", "Uncertain", "Enhanced" or "Supported".

Then the user submits, and the component returns the corresponding selection of the table. See the R function 'select.HPAimmunohisto' in the attached txt file.

If the choice was "RNAseq data": the user has to choose among the values of the "Sample" column (37 values) via a drop-down menu, and the component returns the result of the 'select.HPARNAseq' R function (see the txt file attached to this issue).
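
For readers without the attachment, the selection logic described above could look roughly like the following Python sketch. It is not the R code from the attached file: the function names only mirror 'select.HPAimmunohisto' and 'select.HPARNAseq', the column names ("Tissue", "Level", "Reliability", "Sample") are assumptions about the HPA tab-separated files, and the rows are made up for illustration.

```python
import csv
import io


def select_hpa_immunohisto(rows, tissues, levels, reliabilities):
    """Keep rows whose tissue, level and reliability are all among the selected values."""
    tissues, levels, reliabilities = set(tissues), set(levels), set(reliabilities)
    return [
        row for row in rows
        if row["Tissue"] in tissues
        and row["Level"] in levels
        and row["Reliability"] in reliabilities
    ]


def select_hpa_rnaseq(rows, samples):
    """Keep rows whose sample is among the selected values."""
    samples = set(samples)
    return [row for row in rows if row["Sample"] in samples]


# Made-up rows, for illustration only (not real HPA data).
SAMPLE_TSV = (
    "Gene\tTissue\tCell type\tLevel\tReliability\n"
    "GENE_A\tadrenal gland\tglandular cells\tHigh\tEnhanced\n"
    "GENE_B\tadrenal gland\tglandular cells\tLow\tUncertain\n"
    "GENE_C\tliver\thepatocytes\tHigh\tSupported\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE_TSV), delimiter="\t"))
selected = select_hpa_immunohisto(
    rows,
    tissues=["adrenal gland"],
    levels=["High", "Medium"],
    reliabilities=["Enhanced", "Supported"],
)
print([row["Gene"] for row in selected])
```

Each checkbox group is treated as a set of accepted values, and a row is kept only if it passes all three filters.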

RetrieveExpressionInfofromHPA.zip

NguyenLien commented 6 years ago

@combesf I put this new component on the dev instance so you can test it now :)

combesf commented 6 years ago

Thanks @NguyenLien

I am sorry but it has to be modified. The choice for "Keep and annotate genes present in the following tissue(s)" has to be a drop-down menu in which the user can select one or more choices. Is this possible? If not, radio buttons could be a solution.

As it is now, the user first sees an empty box, which is not very user-friendly :-(

combesf commented 6 years ago

SORRY @NguyenLien I did not see it, but it IS a drop-down menu (it is not visible in my browser), so please ignore my previous post. I will continue testing...

NguyenLien commented 6 years ago

That is actually a drop-down menu: when you click on the empty box, you can choose the tissue you want. We can also set a default value so that the box shows that option instead of being empty.

yvandenb commented 6 years ago

Well, very promising... :-) Indeed, the selection of tissue could be made more straightforward, let's say more intuitive to the end user. Is there any graphical way, in terms of GUI, to improve this? Any ideas?
I also tested using "RNAseq" as a source file: I think a column indicating the "Unit" (here TPM, column 5) is useless, as the unit is uniform in the original file (but to be verified!) => I'd therefore rename the column "Value" into "Value (TPM unit)" and remove the column "Unit"
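
The "to be verified" part and the rename could be checked with something like the Python sketch below. It is only an illustration: the helper name `merge_unit_into_value` is hypothetical, the column names "Value" and "Unit" are taken from the description above, and the rows are made up.

```python
def merge_unit_into_value(rows):
    """Fold a uniform "Unit" column into the "Value" column header.

    Raises ValueError if more than one unit is present, since the rename
    is only safe when the unit is the same for every row.
    """
    if not rows:
        return []
    units = {row["Unit"] for row in rows}
    if len(units) != 1:
        raise ValueError(f"Unit column is not uniform: {sorted(units)}")
    new_name = f"Value ({units.pop()} unit)"
    merged = []
    for row in rows:
        # Copy every column except "Value" and "Unit", then add the renamed column.
        new_row = {k: v for k, v in row.items() if k not in ("Value", "Unit")}
        new_row[new_name] = row["Value"]
        merged.append(new_row)
    return merged


# Made-up rows, for illustration only (not real HPA data).
rnaseq_rows = [
    {"Gene": "GENE_A", "Sample": "liver", "Value": "12.5", "Unit": "TPM"},
    {"Gene": "GENE_B", "Sample": "liver", "Value": "0.3", "Unit": "TPM"},
]
print(merge_unit_into_value(rnaseq_rows))
```

Failing loudly on a mixed "Unit" column avoids silently mislabelling values if a later HPA release introduces another unit.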

NguyenLien commented 6 years ago

As far as I know, the select parameter can currently be displayed as a drop-down (the current one), checkboxes or radio buttons.

vloux commented 6 years ago

Radio means that we can only select one. I agree that the way drop-down menus are implemented in Galaxy (one has to click in the panel first to be able to see the list) is not that intuitive, but it is Galaxy's choice. The only way to work around that is to have checkboxes, like for expression values. But the list will be rather long...

combesf commented 6 years ago

Would it be possible (dreaming is still allowed, isn't it? ;-) ) to display an image of a human body where clicking on an organ selects it? Something like this ... Is it technically complicated? I can take care of the image if needed.

yvandenb commented 6 years ago

Agreed: checkboxes are not a good idea... the drop-down is OK for the moment. Would it be possible to show a tissue name (in an "unselect all" mode) to suggest to users that there is something there to be selected/listed? The human body image (good idea by the way) may be too demanding considering our timeline...

vloux commented 6 years ago

@yvandenb: for me one tissue is selected by default (Adrenal Gland). @combesf: good idea, I think everything is feasible, but we have to think about how to do it!

combesf commented 6 years ago

The same for me: by default "adrenal gland" is selected. Instead, is it possible to display something like "please click to choose a tissue"?

NguyenLien commented 6 years ago

Sorry, but I don't think any of these suggestions is supported by Galaxy. We can only have a drop-down, checkboxes or radio buttons, and the sentence "Please select..." can only be the label or the help text. The box can display the name of an option (like the option in filter keyword) but only if it allows choosing one option (not multiple).

yvandenb commented 6 years ago

Submission form, changes requested:

  1. Replace "Retrieve Information from HPA retrieves data from HPA without any input required" by "Retrieve tissue-specific expression data from HPA (Human Protein Atlas) - no input required"
  2. Replace "Please choose from which experimental data source you want to retrieve data" by "Please choose experimental data source (antibody- or RNAseq-based)"
  3. Source file to be selected: rename "based on immunohistochemisty using tissue micro arrays" into "Expression profiles based on immunohistochemistry"; rename "RNAlevel from RNAseq data" into "RNA levels based on RNA-seq"
  4. Replace "Keep and annotate genes present in the following tissue(s)" by "Select tissue by clicking the dropdown menu below"
  5. Replace "expression value" by "Expression level"; by default only "High" pre-selected
  6. Replace "The gene reliability of the expression value" by "Reliability score"
  7. Change the order of the "Reliability score" labels into the following: Enhanced; Supported; Approved; Uncertain
  8. By default only "Enhanced" and "Supported" pre-selected

User doc coming soon ;-)

yvandenb commented 6 years ago

User doc section (as a proposal): This tool retrieves information from the Human Protein Atlas (https://www.proteinatlas.org/) on the expression profiles of human genes at both the mRNA and protein levels, without any input required. It could be used to:

The resources from Human Protein Atlas that can be queried are the following:

  1. Human normal tissue data: expression profiles for proteins in human tissues based on immunohistochemistry using tissue micro arrays, measured in 58 tissues and 82 cell types. The tab-separated file includes Ensembl gene identifier ("Gene"), tissue name ("Tissue"), annotated cell type ("Cell type"), expression value ("Level"), and the gene reliability of the expression value ("Reliability score"). The reliability score is divided into Enhanced, Supported, Approved, or Uncertain with the following definitions:
  2. RNA levels based on RNA-seq data: RNA levels measured in 64 cell lines and 37 tissues based on RNA-seq experiments. The tab-separated file includes Ensembl gene identifier ("Gene"), analysed sample ("Sample") and transcripts per million ("Value" and "Unit"). The data is based on The Human Protein Atlas version 18 and Ensembl version 88.38. For more information: https://www.proteinatlas.org/about/help

yvandenb commented 6 years ago

As a reminder, this remains to be done: => I also tested using "RNAseq" as a source file: I think a column indicating the "Unit" (here TPM, column 5) is useless, as the unit is uniform in the original file (but to be verified!) => I'd therefore rename the column "Value" into "Value (TPM unit)" and remove the column "Unit"

NguyenLien commented 6 years ago

Nearly forgot it, thank you @yvandenb !

yvandenb commented 6 years ago

Welcome Lien ;-) Once done, please let me know so I can re-check and, if OK, go ahead with deployment on V1.0, in the new "Retrieve annotation from DB" sub-section...

NguyenLien commented 6 years ago

@yvandenb the modification is available on the dev instance now. OK in my tests.

yvandenb commented 6 years ago

Tool tested and now OK...green light for deployment (Toolshed + proteore.org), thank you

yvandenb commented 6 years ago

As usual, please keep me informed once this tool is deployed (and the tool menu updated) so that I can test on proteore.org before closing this issue

NguyenLien commented 6 years ago

@yvandenb The ToolShed and proteore.org are up to date now; the tool panel will be sorted out before tomorrow so you can test!

yvandenb commented 6 years ago

Thank you Lien :-) I'm eager to test this brand new release tomorrow !

vloux commented 6 years ago

tests Ok