federicomarini / ideal

Interactive Differential Expression AnaLysis - DE made accessible and reproducible
https://federicomarini.github.io/ideal/

Id or annotation issue in gene selection post DGE #2

Closed Hoohm closed 5 years ago

Hoohm commented 5 years ago

Hello @federicomarini

Here is the problem I get when using the gene selected from the left menu.

This is the table from below. You can see that the Id column seems fine, but the table also has rownames (I guess).

image

On the left menu, I get this: image

My guess is that the selection is based on the rownames of the table instead of the id column.

I'm trying to look for the bug myself, but I'm having a hard time navigating the code since I'm not that used to Shiny app code.

I think I'm trying to find where this variable is instantiated: https://github.com/federicomarini/ideal/blob/f9931316eab54e4fa8ee385da0bc31a4752a7d66/R/ideal.R#L2875

EDIT: for context, I use gene_symbol as my main ID. I guess you use Ensembl ID by default and that is where this is problematic.

Hoohm commented 5 years ago

Adding also that when I go to the functional analysis tab and try to run a GO analysis, I get this warning:

Warning: Error in .testForValidKeys: 'keys' must be a character vector

I guess this stems from the same issue, something linked to the ID used.

Hoohm commented 5 years ago

Update: if you don't load the annotation data, this does not occur. Hope this helps

federicomarini commented 5 years ago

Hi Patrick. You are right in guessing it is something with the annotation format.

Indeed the format has to follow these rules:

In summary: just edit the annotation format slightly and it should work.

Hoohm commented 5 years ago

So my next question is: would you enforce Ensembl IDs as the main ID used in the count matrix? I'm OK with making the change in our pipeline.

federicomarini commented 5 years ago

I'd strongly recommend it, both because of the uniqueness that they more or less guarantee and because of their stability across released versions.
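The uniqueness point can be checked directly on a set of row identifiers. The following is a minimal sketch in Python, not code from ideal; the function name `duplicated_ids` is made up for illustration. It reports every identifier that occurs more than once, which is the situation non-unique IDs can produce.

```python
from collections import Counter


def duplicated_ids(identifiers):
    """Return the sorted identifiers that appear more than once.

    Hypothetical helper for illustration only; not part of ideal.
    """
    counts = Counter(identifiers)
    return sorted(i for i, n in counts.items() if n > 1)
```

An empty result means every identifier is unique and can safely serve as a row key.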

Hoohm commented 5 years ago

@federicomarini I tried with Ensembl IDs as the main ID in my counts, and now the bug is gone.

Would it be wise to "force" users to use Ensembl IDs by checking whether the IDs in the counts actually match Ensembl IDs, to fix the bug for users who choose otherwise, or to make some tools unavailable when something other than Ensembl IDs is used?
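A check of this kind could be sketched as follows. This is a language-agnostic illustration written in Python, not code from ideal. The names `looks_like_ensembl_gene_id` and `fraction_ensembl` are hypothetical. The pattern assumes the common Ensembl gene ID layout (`ENS`, an optional species prefix, `G`, 11 digits, and an optional version suffix), which not every species in Ensembl follows.

```python
import re

# Assumed layout: "ENS" + optional species letters + "G" + 11 digits,
# optionally followed by a version suffix such as ".12".
ENSEMBL_GENE_ID = re.compile(r"ENS[A-Z]*G\d{11}(\.\d+)?")


def looks_like_ensembl_gene_id(identifier):
    """Return True if the whole string matches the assumed pattern."""
    return ENSEMBL_GENE_ID.fullmatch(identifier) is not None


def fraction_ensembl(identifiers):
    """Return the share of identifiers that look like Ensembl gene IDs."""
    ids = list(identifiers)
    if not ids:
        return 0.0
    hits = sum(looks_like_ensembl_gene_id(i) for i in ids)
    return hits / len(ids)
```

Reporting a fraction rather than a hard pass/fail would let an application warn the user, instead of blocking datasets that legitimately use other identifiers.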

federicomarini commented 5 years ago

I have been thinking of enforcing this, but it could cut out some users, as well as situations where the dataset cannot have this type of identifier. Therefore I opted for a more liberal "high recommendation" 😄

Glad you fixed the issue, anyway! F