honzee / RNAseqCNV

R package for large-scale CNV analysis from RNA-seq
MIT License

totalRNA library results variant filtering #23

Open nerea-bilbao opened 1 year ago

nerea-bilbao commented 1 year ago

Dear Honzee,

This time I am writing about the VCF files from our variant calling analysis. We work with total RNA libraries (150x coverage), and our VCF files seem much larger than yours (which we guess come from poly(A) libraries). When we run RNAseqCNV, our patients show quite a few unknown structural alterations (marked with question marks). My question is: do you think exhaustive variant filtering would reduce the unknown results? We have thought about the following filters: FILTER = PASS, minimum allele frequency > 0.05, DP > 30, QUAL > 20, and protein-coding variants only.
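To make the proposed filters concrete, here is a minimal sketch in plain Python. It rests on several assumptions: a single-sample VCF whose FORMAT column carries GATK-style `AD` and `DP` fields, "allele frequency" read as the alt-allele fraction computed from `AD`, and hypothetical function names (`passes_filters`, `filter_vcf_lines`). The protein-coding filter is left out because it needs gene annotation that a plain VCF does not carry.

```python
def parse_sample(format_field, sample_field):
    """Pair FORMAT keys (e.g. GT:AD:DP) with one sample's values."""
    return dict(zip(format_field.split(":"), sample_field.split(":")))


def passes_filters(line, min_af=0.05, min_dp=30, min_qual=20.0):
    """Return True if a single-sample VCF data line meets all thresholds."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 10:
        return False  # no sample column to read AD/DP from
    qual, filt = cols[5], cols[6]
    if filt != "PASS" or qual == ".":
        return False
    if float(qual) <= min_qual:
        return False
    sample = parse_sample(cols[8], cols[9])
    try:
        depth = int(sample["DP"])
        # Assumption: AD lists the ref count first, then the first alt count.
        ref_count, alt_count = (int(x) for x in sample["AD"].split(",")[:2])
    except (KeyError, ValueError):
        return False  # missing or malformed DP/AD
    if depth <= min_dp or ref_count + alt_count == 0:
        return False
    return alt_count / (ref_count + alt_count) > min_af


def filter_vcf_lines(lines, **thresholds):
    """Yield header lines unchanged and only the data lines that pass."""
    for line in lines:
        if line.startswith("#") or passes_filters(line, **thresholds):
            yield line
```

Applied to an open file handle, `filter_vcf_lines(handle)` yields the lines of a filtered VCF that can be written straight back out; the thresholds are keyword arguments so they can be loosened or tightened per run.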

We would like to know your opinion,

Thanks a lot,

honzee commented 1 year ago

Hey,

Good to hear from you. I understand that your VCF files could be different.

Could you send me an example image of the output? As the random forest model is working with the information extracted from the graphs, I could better analyze what might be causing the issue.

Best, Jan

nerea-bilbao commented 1 year ago

Dear Jan,

My apologies for answering so late.

Please find two VCFs (raw and filtered), with their respective RNAseqCNV results, at this link: https://ehubox.ehu.eus/s/XH3AJKHyRAeW49R

Thank you for your help,

Nere,


honzee commented 1 year ago

Hi,

I am sorry for the late answer, I was away. Based on the images you sent me, I don't think the VCF files are the problem. On the contrary, I think the results look quite good for the MAFs. The "?" marks in your results are there because the predictions for those CNVs had lower confidence. The "?" sign tells you that you should consider visually checking whether the prediction matches what you would expect for that MAF and expression pattern.

Hope this helps. Best, Jan