MPIIComputationalEpigenetics / DeepBlue

DeepBlue Epigenomic Data Server

get expressions is not returning any results #143

Closed mlist closed 8 years ago

mlist commented 8 years ago

I tried to download gene expression data following the example in the R vignette:

genes_names = c('CCR1', 'CD164', 'CD1D', 'CD2', 'CD34', 'CD3G', 'CD44')

deepblue_select_gene_expressions(sample_ids = "s10205", genes = genes_names, gene_model = "gencode v23")

The result is empty. I'm not sure whether it's an R-related problem, but when I manually inspect the XML-RPC output, it is empty as well. Check query id q300031 and request id r174076 to see what I mean.

I also tried ENSG IDs and a different sample; neither worked.

felipealbrecht commented 8 years ago

Just to be sure: did you use your own user key?

mlist commented 8 years ago

Not at first, but then I tried again with my user key. That didn't help. The vignette example shouldn't require a key anyway, right?

Best, Markus

Felipe Albrecht notifications@github.com wrote on Wed, 31 Aug 2016, 17:54:

Just for sake: Did you use your own user key?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/MPIIComputationalEpigenetics/DeepBlue/issues/143#issuecomment-243809541, or mute the thread https://github.com/notifications/unsubscribe-auth/ABVg3avAZecQruQorkDudRKDIOUb6Uekks5qlaM7gaJpZM4JxYTN .

felipealbrecht commented 8 years ago

We only have gene expression data for DEEP, which is a private project.

mlist commented 8 years ago

You mean only DEEP has RNA-seq gene expression, right? There is plenty of array data.

But I have access and it didn't work with my key. Does it work for you?


felipealbrecht commented 8 years ago

Sorry, I don't understand your question, but:

  • Let's define gene expression data as the data from the FPKM files of the DEEP project. These data are not listed with the usual experiment data but with the list_gene_expressions data, because they must be mapped to a gene model.
  • I found a bug in select_gene_expressions and I am working on it. You can see that it always returns the same query id, even when the gene names are different.

mlist commented 8 years ago

To clarify:

DeepBlue also lists gene expression data for the ENCODE project, for instance. These data were all measured using microarrays rather than next-generation sequencing, but ideally they would also be accessible through select_gene_expressions, since they can and should be mapped to the gene model.
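The symptom Felipe describes (the same query id coming back regardless of the gene names) can be checked mechanically. The following is only a sketch: `query_ids_are_distinct`, `buggy_select`, and `working_select` are hypothetical names invented for illustration, and the stubs stand in for a real call such as deepblue_select_gene_expressions so that the example runs without a server.

```r
# Hypothetical helper: `select_fun` stands in for any function that takes a
# character vector of gene names and returns a query id as a string.
query_ids_are_distinct <- function(select_fun, gene_sets) {
  ids <- vapply(gene_sets, select_fun, character(1))
  !any(duplicated(ids))
}

# Stub reproducing the reported symptom: the gene names are ignored.
buggy_select <- function(genes) "q_toy_1"

# Stub that derives the id from its input, so different gene sets
# produce different ids.
working_select <- function(genes) {
  paste0("q_toy_", paste(sort(genes), collapse = "_"))
}

gene_sets <- list(c("CCR1", "CD164"), c("CD34", "CD44"))

query_ids_are_distinct(buggy_select, gene_sets)    # FALSE
query_ids_are_distinct(working_select, gene_sets)  # TRUE
```

Against the live server, one would pass a wrapper around the real select call instead of the stubs.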

felipealbrecht commented 8 years ago

No, these data are already mapped to genomic regions. The list_gene_expressions data are for data that have not been mapped to genomic regions yet.

mlist commented 8 years ago

Let's discuss this tomorrow or on Friday.


felipealbrecht commented 8 years ago

Sorry, but I can't implement it. The data are there and accessible; I just don't have enough time to implement it.

felipealbrecht commented 8 years ago

in progress:

genes_names = c('CCR1', 'CD164', 'CD1D', 'CD2', 'CD34', 'CD3G', 'CD44')

user_key = "rbZYVnLKIJeBCZTO"

query_id = deepblue_select_gene_expressions( sample_ids="s10197", genes = genes_names, gene_model = "gencode v23", user_key=user_key)

request_id = deepblue_get_regions(query_id, "TRACKING_ID,GENE_ID,GENE_SHORT_NAME,FPKM,FPKM_CONF_LO,FPKM_CONF_HI,FPKM_STATUS" , user_key=user_key)

deepblue_download_request_data(request_id, user_key)
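For anyone inspecting the raw XML-RPC output by hand, as in the original report, a small parser makes an empty result easy to tell apart from a populated one. This is a minimal sketch under one assumption: that the raw request output is tab-separated text with one gene per line, in the column order given to deepblue_get_regions. `parse_expression_text` is a hypothetical helper, and the sample row contains invented placeholder values, not real DEEP data.

```r
# Column list copied from the deepblue_get_regions() call above.
output_format <- "TRACKING_ID,GENE_ID,GENE_SHORT_NAME,FPKM,FPKM_CONF_LO,FPKM_CONF_HI,FPKM_STATUS"
column_names <- strsplit(output_format, ",", fixed = TRUE)[[1]]

# Hypothetical helper: turn raw tab-separated text into a data frame.
# Assumption: one line per gene, fields in the order of `column_names`.
parse_expression_text <- function(raw_text, column_names) {
  if (!nzchar(trimws(raw_text))) {
    # Empty output, as in the original report: return a zero-row table
    # that still carries the expected column names.
    empty <- as.data.frame(
      matrix(character(0), nrow = 0, ncol = length(column_names)),
      stringsAsFactors = FALSE
    )
    names(empty) <- column_names
    return(empty)
  }
  read.delim(text = raw_text, header = FALSE, col.names = column_names,
             stringsAsFactors = FALSE)
}

# Placeholder row with invented values, only to exercise the parser.
toy_text <- "TOY_ID.1\tTOY_ID.1\tTOYGENE\t1.5\t1.0\t2.0\tOK\n"
parsed <- parse_expression_text(toy_text, column_names)
nrow(parse_expression_text("", column_names))  # zero rows for an empty result
```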

felipealbrecht commented 8 years ago

DeepBlue is having a problem mapping the gene CD34 to gencode v23.

Explanation:

  • In the DEEP data, this gene has the tracking ID ENSG00000174059.12, which does not exist in gencode v23. (See http://deepblue.mpi-inf.mpg.de/dashboard.php#ajax/deepblue_view_genes.php)
  • I have two options here:
    • return an error saying that it is not possible to map the gene (the current behavior)
    • use the gene name CD34 to map the gene expression data to the coordinates.

Which of the two options is correct or more robust?

mlist commented 8 years ago

The problem is that DEEP used gencode v19 for mapping these genes. There is no guarantee that the gene symbol will lead to a match either, since symbols also change sometimes. It might be a good idea to automatically test the unmapped genes against an older version and to produce a warning or error. If that is too complicated, producing an error is fine as long as there is also an option, ignore.missing, to override it. What do you think?

Markus List
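The fallback proposed above could look roughly like the sketch below: try the requested gene model first, retry unmapped IDs against an older model with a warning, and raise an error for anything still unmapped unless ignore.missing is set. This is not DeepBlue's implementation. `map_expression_ids` and both toy gene models are hypothetical, and the IDs are made-up placeholders rather than real GENCODE records.

```r
# Toy gene models; every row is an invented placeholder.
gene_model_new <- data.frame(
  gene_id = c("ENSG_TOY_A.2", "ENSG_TOY_B.1"),
  gene_name = c("GENEA", "GENEB"),
  stringsAsFactors = FALSE
)
gene_model_old <- data.frame(
  gene_id = c("ENSG_TOY_A.1", "ENSG_TOY_B.1", "ENSG_TOY_C.1"),
  gene_name = c("GENEA", "GENEB", "GENEC"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: report which gene model each expression ID maps to.
map_expression_ids <- function(ids, primary, fallback = NULL,
                               ignore.missing = FALSE) {
  mapped_by <- rep(NA_character_, length(ids))
  mapped_by[ids %in% primary$gene_id] <- "primary"

  if (!is.null(fallback)) {
    # Retry only the IDs that the primary model could not resolve.
    in_fallback <- is.na(mapped_by) & ids %in% fallback$gene_id
    if (any(in_fallback)) {
      warning("Mapped via the fallback gene model: ",
              paste(ids[in_fallback], collapse = ", "))
      mapped_by[in_fallback] <- "fallback"
    }
  }

  still_missing <- ids[is.na(mapped_by)]
  if (length(still_missing) > 0 && !ignore.missing) {
    stop("Could not map: ", paste(still_missing, collapse = ", "))
  }
  data.frame(gene_id = ids, mapped_by = mapped_by, stringsAsFactors = FALSE)
}

# The first ID exists only in the older toy model, so it triggers the warning.
result <- suppressWarnings(
  map_expression_ids(c("ENSG_TOY_A.1", "ENSG_TOY_B.1"),
                     gene_model_new, gene_model_old)
)
```

With ignore.missing = TRUE, unmapped IDs stay in the result with mapped_by set to NA instead of stopping the call, so the caller can decide how to handle them.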

felipealbrecht commented 8 years ago

The implemented solution is:

It is already in the live version and the previous example is working.

Thank you for this bug report!