honzee / RNAseqCNV

R package for large-scale CNV analysis from RNA-seq
MIT License

Error with dplyr/mutate #11

Closed Kaddea closed 2 years ago

Kaddea commented 2 years ago

Happy new year, everyone!!

I've freshly installed the package and get the following error when running the script. Is there anything unusual I should look for in the vcf files?

Thank you, Mathias

Reading in vcf file..
Extracting depth..
Extracting reference allele and alternative allele depths..
Needed information from vcf extracted
Finished reading vcf
Error: Problem with mutate() column snvOrd.
i snvOrd = 1:n().
i snvOrd must be size 0 or 1, not 2.
Run rlang::last_error() to see where the error occurred.

rlang::last_error()
x
+-<error/dplyr:::mutate_error>
| Problem with mutate() column snvOrd.
| i snvOrd = 1:n().
| i snvOrd must be size 0 or 1, not 2.
-<error/dplyr:::mutate_incompatible_size>
Backtrace:
  1. RNAseqCNV::RNAseqCNV_wrapper(...)
  2. dplyr:::abort_glue(character(0), list(x_size = 2L), "dplyr:::mutate_incompatible_size")
  3. rlang::exec(abort, class = class, !!!data)
Run rlang::last_trace() to see the full context.

rlang::last_trace()
x
+-<error/dplyr:::mutate_error>
| Problem with mutate() column snvOrd.
| i snvOrd = 1:n().
| i snvOrd must be size 0 or 1, not 2.
-<error/dplyr:::mutate_incompatible_size>
Backtrace:
x
  4. +-RNAseqCNV::RNAseqCNV_wrapper(...)
  5. | -RNAseqCNV:::calc_chrom_lvl(smpSNPdata.tmp)
  6. | -%>%(...)
  7. +-dplyr::mutate(., chr = factor(chr, levels = c(1:22, "X")))
  8. +-dplyr::ungroup(.)
  9. +-dplyr::mutate(...)
  10. +-dplyr::filter(., snvOrd <= 1000)
  11. +-dplyr::mutate(., snvOrd = 1:n())
  12. +-dplyr:::mutate.data.frame(., snvOrd = 1:n())
  13. | -dplyr:::mutate_cols(.data, ..., caller_env = caller_env())
  14. | +-base::withCallingHandlers(...)
  15. | -mask$eval_all_mutate(quo)
  16. -dplyr:::abort_glue(character(0), list(x_size = 2L), "dplyr:::mutate_incompatible_size")
  17. -rlang::exec(abort, class = class, !!!data)

honzee commented 2 years ago

Hi Matthias,

I am really glad you decided to try our package. I hope we can get the issues fixed asap.

It seems the error occurs at the point where SNVs are ordered according to their position on the chromosomes. Perhaps we could first check whether the vcf is in a format compatible with RNAseqCNV. I have copied the first few lines of a working vcf file below, without the header (the header is not used by RNAseqCNV). You can check whether your file matches.

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT File1
1 10560 . C G 21.77 . AC=2;AF=1.00;AN=2;DP=2;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=60.00;QD=10.88;SOR=0.693 GT:AD:DP:GQ:PL 1/1:0,2:2:6:49,6,0
1 14653 . C T 97.28 . AC=2;AF=1.00;AN=2;DP=3;ExcessHet=3.0103;FS=0.000;MLEAC=2;MLEAF=1.00;MQ=60.00;QD=32.43;SOR=2.833 GT:AD:DP:GQ:PL 1/1:0,3:3:9:125,9,0
1 14907 . A G 220.28 . AC=1;AF=0.500;AN=2;BaseQRankSum=-1.429;ClippingRankSum=0.000;DP=11;ExcessHet=3.0103;FS=0.000;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.000;QD=20.03;ReadPosRankSum=-0.549;SOR=0.223 GT:AD:DP:GQ:PL 0/1:1,10:11:9:248,0,9
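
One quick way to run the compatibility check described above is to list the chromosome names that actually occur in the first column of the vcf. The following is a minimal sketch in Python, not part of RNAseqCNV: the function name `unexpected_contigs` and the file path in the comment are hypothetical, and the accepted set (1-22, X, Y) is an assumption based on the working example and on the `levels = c(1:22, "X")` call visible in the backtrace.

```python
import io

# Chromosome names assumed to be accepted: 1-22 plus X and Y (an assumption).
EXPECTED = {str(i) for i in range(1, 23)} | {"X", "Y"}

def unexpected_contigs(handle):
    """Return the sorted CHROM values in a VCF that are not in EXPECTED."""
    seen = set()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header/meta lines and blank lines
        seen.add(line.split("\t", 1)[0])  # first tab-separated field is CHROM
    return sorted(seen - EXPECTED)

# Tiny in-memory demo; for a real file pass open("sample.vcf") (hypothetical path).
demo = io.StringIO(
    "#CHROM\tPOS\tID\tREF\tALT\n"
    "1\t10560\t.\tC\tG\n"
    "NC_000001.11\t14464\trs546169444\tA\tT\n"
)
print(unexpected_contigs(demo))  # prints ['NC_000001.11']
```

An empty list means every record uses one of the assumed names; anything else points to a naming mismatch worth fixing before running the wrapper.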

Kaddea commented 2 years ago

Thank you for the fast reply :)

The chromosome labeling might be the issue here, as my vcf files refer to the chromosomes as NC_000001 to NC_000024. I should probably convert them to the form 1 to Y. Any other hints are welcome, too ...

vcf snippet:

##fileformat=VCFv4.2
... (lots of comments)
##contig=
##source=HaplotypeCaller
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT S008-4WQRE6-X1-R3
NC_000001.11 14464 rs546169444 A T 365.63 . AC=1;AF=0.500;AN=2;BaseQRankSum=-1.556;DB;DP=11;ExcessHet=3.0103;FS=0.000;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.000;QD=33.24;ReadPosRankSum=1.897;SOR=3.442 GT:AD:DP:GQ:PL 0/1:1,10:11:12:373,0,12
NC_000001.11 14522 rs1441808061 G A 1004.60 . AC=1;AF=0.500;AN=2;BaseQRankSum=5.350;DB;DP=44;ExcessHet=3.0103;FS=3.212;MLEAC=1;MLEAF=0.500;MQ=60.00;MQRankSum=0.000;QD=22.83;ReadPosRankSum=2.478;SOR=1.828 GT:AD:DP:GQ:PL 0/1:21,23:44:99:1012,0,621

honzee commented 2 years ago

Yes, the chromosome naming is the issue, since RNAseqCNV assumes the chromosomes are named 1-22, X, Y.

If anything else comes up, do not hesitate to contact me.

Best, Jan
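
The renaming discussed above can be sketched as a small Python script. The error message is consistent with the naming explanation: if no SNV carries a recognized chromosome name, the table reaching `mutate()` has zero rows, and in R `1:n()` with `n() == 0` evaluates to `c(1, 0)`, a vector of length 2, hence "must be size 0 or 1, not 2". That reading is an inference from the traceback, not something confirmed in the thread. The sketch below is not part of RNAseqCNV; the function names are hypothetical, and it assumes human RefSeq accessions, where NC_000001 to NC_000022 are the autosomes, NC_000023 is X and NC_000024 is Y. Records on any other contig are dropped, and header lines are copied unchanged because, per the reply above, the header is not used by RNAseqCNV.

```python
import io
import re

# RefSeq accession (without version) -> plain chromosome name.
REFSEQ_TO_PLAIN = {f"NC_{i:06d}": str(i) for i in range(1, 23)}
REFSEQ_TO_PLAIN.update({"NC_000023": "X", "NC_000024": "Y"})

def rename_contig(name):
    """Map e.g. 'NC_000001.11' to '1'; return None if there is no mapping."""
    accession = re.sub(r"\.\d+$", "", name)  # drop the version suffix (.11)
    return REFSEQ_TO_PLAIN.get(accession)

def convert_vcf(src, dst):
    """Copy a VCF from src to dst, renaming CHROM and dropping unmapped contigs."""
    for line in src:
        if line.startswith("#"):
            dst.write(line)  # header lines are copied unchanged
            continue
        chrom, sep, rest = line.partition("\t")
        new_name = rename_contig(chrom)
        if new_name is not None:
            dst.write(new_name + sep + rest)

# In-memory demo; for real files pass two open file handles instead.
src = io.StringIO(
    "#CHROM\tPOS\n"
    "NC_000001.11\t14464\n"
    "NC_000023.11\t100\n"
    "NC_012920.1\t5\n"  # mitochondrial contig, not in the mapping, so dropped
)
dst = io.StringIO()
convert_vcf(src, dst)
print(dst.getvalue(), end="")
```

For large or compressed files, `bcftools annotate --rename-chrs` with a two-column mapping file is an alternative that avoids custom code.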

Kaddea commented 2 years ago

Works great. Thanks a lot!!

honzee commented 2 years ago

Great, I am glad to hear that.

Best, Jan