aahowel3 / Molecular-Modifiers-TSC


question about GENEINFO tagname and biomaRt package #2

Open ChristopheLegendre opened 2 years ago

ChristopheLegendre commented 2 years ago

https://github.com/aahowel3/Molecular-Modifiers-TSC/blob/594d0ac9edc416a6a7aafc7788ac03e3934cd8e2/filteringcalls_dbsnp_tester_withsnpeff.R#L20

Hi Abigail,

GENEINFO

How do you handle variant records that list more than one gene? For instance, GENEINFO can hold values such as GENE1:321|GENE2:654 or GENE8:xx321|GENE9:xx654|GENE4:987.
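To make the shape of the problem concrete, here is a minimal R sketch of splitting such values. The `geneinfo` vector is hypothetical, modelled on the examples above, and the variable names are not taken from the repository's script.

```r
# Hypothetical GENEINFO values, modelled on the examples in this post.
geneinfo <- c("GENE1:321",
              "GENE1:321|GENE2:654",
              "GENE8:xx321|GENE9:xx654|GENE4:987")

# Split each record on the literal "|" separator. The result is a list,
# because records can hold a different number of gene entries.
entries <- strsplit(geneinfo, "|", fixed = TRUE)

# Drop the ":id" suffix so that only the gene symbols remain.
symbols <- lapply(entries, function(x) sub(":.*$", "", x))

# Number of genes per record.
lengths(symbols)
```

The list elements have unequal lengths, so any step that forces them into a rectangular table has to decide what to do with the missing cells.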

Because of this, the line referenced above gives me the following warning:

Warning message:
In rbind(...) :
  number of columns of result is not a multiple of vector length (arg 3)
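This kind of warning can be reproduced in isolation. The sketch below assumes the split pieces are combined with `do.call(rbind, ...)`, which is a common pattern but not necessarily what the referenced line does. The input values are hypothetical.

```r
# Hypothetical records with one, two and three gene entries.
geneinfo <- c("GENE1:321",
              "GENE1:321|GENE2:654",
              "GENE8:xx321|GENE9:xx654|GENE4:987")
parts <- strsplit(geneinfo, "|", fixed = TRUE)

# rbind() on vectors of unequal length recycles the shorter ones and warns
# when the widest row is not a multiple of a shorter row's length.
msg <- tryCatch(
  {
    do.call(rbind, parts)
    NA_character_
  },
  warning = function(w) conditionMessage(w)
)

# One way to avoid the recycling: pad every record with NA to the same width.
n <- max(lengths(parts))
padded <- do.call(
  rbind,
  lapply(parts, function(x) c(x, rep(NA_character_, n - length(x))))
)
```

With recycling, the extra cells of a short row are filled with repeats of its own values, so later columns can contain gene names that look real but are duplicates. The first column is unaffected either way, and the padded version leaves the extra cells as `NA`.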

In the end, I can see that you only use the GENEINFO.X1 column to get the gene name, so why deal with more than one column? I do not know whether this is related, but something may not be working for me: in the final table, the score_final column contains only zero or NA values.

Question: Could you explain how line 20 (referenced above) works?

biomaRt?

I have a question about the biomaRt package: where do you use any function from it? I could not find one. Is that package actually used?

Thanks. Best, Christophe

aahowel3 commented 2 years ago

Hi Chris,

In the GENEINFO column generated by SnpEff, I've only been considering the first gene annotation. I can have it consider all of the columns it gets split into, though, to check whether any of those names are present in the mTOR gene list file.
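A rough sketch of what checking every annotated gene could look like, as opposed to only the first one. The gene list, the example records and all variable names here are hypothetical stand-ins, since the actual gene list file and column names are not shown in this thread.

```r
# Hypothetical stand-in for the contents of the mTOR gene list file.
mtor_genes <- c("MTOR", "TSC1", "TSC2")

# Hypothetical GENEINFO values; the second record lists two genes.
geneinfo <- c("TSC2:7249", "NTHL1:4913|TSC2:7249", "PKD1:5310")

# Split on "|" and keep only the gene symbols.
symbols <- lapply(
  strsplit(geneinfo, "|", fixed = TRUE),
  function(x) sub(":.*$", "", x)
)

# Current behaviour as described: only the first gene is compared.
first_only <- vapply(symbols, function(s) s[1] %in% mtor_genes, logical(1))

# Alternative: flag the record if any of its genes is in the list.
any_gene <- vapply(symbols, function(s) any(s %in% mtor_genes), logical(1))

data.frame(geneinfo, first_only, any_gene)
```

In this toy input the second record is missed by the first-gene check but caught by the any-gene check, which is the situation raised in the question.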

Could you send me an example of the VCF you're using? The issue might be caused by inputting a VEP-annotated file rather than a SnpEff-annotated one.

I actually think the biomaRt library is a remnant of an old script; we can get rid of it.

Best, Abby

