gianfilippo closed this issue 4 years ago
Hi Gianfilippo,
I recently encountered this error because the tumour and normal sample names could not be identified correctly.
I resolved it by identifying the T and N sample names from the vcf files. Do you happen to know your sample names? They should be your bam file names.
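A minimal base-R sketch for listing the sample names stored in a VCF, in case it helps with the check above. `vcf_sample_names` is a hypothetical helper, not part of SMuRF, and the demo file and its sample names are made up for illustration.

```r
# Hypothetical helper (not part of SMuRF): return the sample names found on
# the #CHROM header line of a VCF, plain or gzipped.
vcf_sample_names <- function(path) {
  con <- gzfile(path, open = "rt")  # gzfile() also reads uncompressed files
  on.exit(close(con))
  repeat {
    line <- readLines(con, n = 1)
    if (length(line) == 0) stop("no #CHROM header line found in ", path)
    if (startsWith(line, "#CHROM")) break
  }
  fields <- strsplit(line, "\t", fixed = TRUE)[[1]]
  # Columns 1-9 are the fixed VCF fields (CHROM ... FORMAT); samples follow.
  if (length(fields) > 9) fields[-(1:9)] else character(0)
}

# Demo on a tiny made-up VCF with two illustrative sample names.
demo <- tempfile(fileext = ".vcf.gz")
con <- gzfile(demo, open = "wt")
writeLines(c(
  "##fileformat=VCFv4.2",
  paste(c("#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
          "FORMAT", "G1700N_012", "G1700T_012"), collapse = "\t")
), con)
close(con)

print(vcf_sample_names(demo))
```

Running the helper over mutect2.vcf.gz, freebayes.vcf.gz, varscan.vcf.gz and vardict.vcf.gz in the input directory shows at a glance whether the four callers agree on the tumour and normal names.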
Regards, Weitai
On 22 Oct 2019, at 1:05 PM, Gianfilippo Coppola notifications@github.com wrote:
Hi,
I am testing SMuRF on a set of files I generated by running the individual callers. I am getting an "Error in normalizeDoubleBracketSubscript". It seems that the expected data type is not there. Are there specific requirements for the input vcfs?
Thanks Gianfilippo
Below is my command line and the output:
myresults = smurf(directory = "Variants_hg38_BWA_ensemble/Sample_G1700T_012",mode="combined",nthreads=20,output.dir="Variants_hg38_BWA_ensemble/Sample_G1700T_012",build="hg38",check.packages=T)
[1] "SMuRFv1.6 (3rd Oct 2019)"
[1] "Saving output files to: Variants_hg38_BWA_ensemble/Sample_G1700T_012"
Connection successful!
R is connected to the H2O cluster:
H2O cluster version: 3.26.0.2
H2O cluster version age: 2 months and 25 days
H2O cluster total nodes: 1
H2O cluster total memory: 26.63 GB
H2O cluster total cores: 20
H2O cluster allowed cores: 1
H2O cluster healthy: TRUE
H2O API Extensions: Amazon S3, XGBoost, Algos, AutoML, Core V3, Core V4
R Version: R version 3.5.0 (2018-04-23)
Accessing files:
Variants_hg38_BWA_ensemble/Sample_G1700T_012/mutect2.vcf.gz
Variants_hg38_BWA_ensemble/Sample_G1700T_012/freebayes.vcf.gz
Variants_hg38_BWA_ensemble/Sample_G1700T_012/varscan.vcf.gz
Variants_hg38_BWA_ensemble/Sample_G1700T_012/vardict.vcf.gz
[1] "Parsing step"
[1] "reading vcfs"
[1] "reading mutect2"
[1] "reading freebayes"
[1] "reading varscan"
[1] "reading vardict"
Time difference of 16.48991 secs
[1] "extracting calls passed by at least 1 caller"
Time difference of 0.82076 secs
[1] "extracting meta data from VRanges"
Error in normalizeDoubleBracketSubscript(i, x, exact = exact, allow.NA = TRUE, : invalid [[ subscript type: NULL
Hi,
thanks.
I can see the sample names (tumor and normal) in each of the 4 files (I have Mutect2, VarScan2, VarDict, freebayes). They are all from the same sample, but each file uses a different name. I guess I have to fix that. Also, VarScan used the whole file path as the sample names, and my freebayes vcf has an extra column that is not in your sample freebayes vcf. How do I get rid of it?
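One possible way to do both fixes (drop an extra sample column and give every caller's VCF the same names) is sketched below in base R. This is only a sketch under assumptions: `harmonize_vcf_samples` is a hypothetical helper, not a SMuRF function, and the old and new sample names in the demo are invented. It reads the whole file into memory and writes ordinary gzip rather than bgzip, so it suits small test files best.

```r
# Hypothetical helper (not part of SMuRF): rewrite a VCF keeping only the
# chosen sample columns, under new names. `keep` is a named character vector
# mapping old sample names (as in the header) to the new ones.
harmonize_vcf_samples <- function(infile, outfile, keep) {
  con <- gzfile(infile, open = "rt")
  lines <- readLines(con)
  close(con)

  hdr <- which(startsWith(lines, "#CHROM"))
  if (length(hdr) != 1) stop("expected exactly one #CHROM line")
  header <- strsplit(lines[hdr], "\t", fixed = TRUE)[[1]]

  idx <- match(names(keep), header)
  if (anyNA(idx) || any(idx <= 9)) {
    stop("sample(s) not found in header: ",
         paste(names(keep)[is.na(idx) | idx <= 9], collapse = ", "))
  }
  cols <- c(1:9, idx)  # fixed VCF fields, then the kept samples in `keep` order

  # Rename the kept samples on the header line and drop the other columns.
  header[idx] <- unname(keep)
  lines[hdr] <- paste(header[cols], collapse = "\t")

  # Apply the same column selection to every record after the header.
  subset_line <- function(x) {
    paste(strsplit(x, "\t", fixed = TRUE)[[1]][cols], collapse = "\t")
  }
  body <- seq_along(lines) > hdr
  lines[body] <- vapply(lines[body], subset_line, character(1),
                        USE.NAMES = FALSE)

  out <- gzfile(outfile, open = "wt")
  writeLines(lines, out)
  close(out)
  invisible(outfile)
}

# Demo on a made-up VCF: path-like sample names plus one unwanted column.
infile <- tempfile(fileext = ".vcf.gz")
con <- gzfile(infile, open = "wt")
writeLines(c(
  "##fileformat=VCFv4.2",
  paste(c("#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO",
          "FORMAT", "/data/normal.bam", "/data/tumor.bam", "extra"),
        collapse = "\t"),
  paste(c("chr1", "100", ".", "A", "T", "50", "PASS", ".", "GT",
          "0/0", "0/1", "0/0"), collapse = "\t")
), con)
close(con)

out <- tempfile(fileext = ".vcf.gz")
harmonize_vcf_samples(infile, out,
                      keep = c("/data/normal.bam" = "N_012",
                               "/data/tumor.bam" = "T_012"))
```

Using the same `keep` targets for all four caller outputs gives every VCF identical tumour and normal names.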
Thanks
Hi,
I just edited the freebayes vcf and made sure all sample names in the various vcfs are consistent (see below). I am still getting the exact same error.
Do you have any other thoughts?
Thanks
Resolving error: Error in normalizeDoubleBracketSubscript(i, x, exact = exact, allow.NA = TRUE, : invalid [[ subscript type: NULL
Cause: vcf sample names for tumour and normal files not detected automatically.
Solution: Manually state your tumor file tag with t.label. Examples: t.label='-T', t.label='_tumor', t.label='T001', t.label='T' #also works for you
Error message: 't.label for tumor sample is not unique, duplicated or missing'
myresults = smurf(directory = "Variants_hg38_BWA_ensemble/Sample_G1700T_012",
mode="combined",
t.label='T_012',
nthreads=20,
output.dir="Variants_hg38_BWA_ensemble/Sample_G1700T_012",
build="hg38",
check.packages=T)
Please download the latest patch SMuRF-v1.6.2. Thanks!
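Before rerunning, a candidate t.label can be checked against the sample names to see whether it picks out exactly one of them. The sketch below assumes the label is matched as a plain substring of the sample name. The thread does not confirm how SMuRF matches it internally, so treat this as a sanity check only. `check_t_label` and the sample names are hypothetical.

```r
# Hypothetical helper (not part of SMuRF): return the single sample name that
# contains `t.label`, or stop if the label matches none or more than one.
# Assumption: plain substring matching.
check_t_label <- function(sample_names, t.label) {
  hits <- sample_names[grepl(t.label, sample_names, fixed = TRUE)]
  if (length(hits) != 1) {
    stop("t.label '", t.label, "' matches ", length(hits),
         " sample name(s): ", paste(hits, collapse = ", "))
  }
  hits
}

samples <- c("G1700N_012", "G1700T_012")  # made-up names for illustration

tumour <- check_t_label(samples, "T_012")  # matches only the tumour sample
print(tumour)

# A label shared by both names is rejected, much like the
# "not unique, duplicated or missing" message quoted above.
ambiguous <- try(check_t_label(samples, "012"), silent = TRUE)
print(inherits(ambiguous, "try-error"))
```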
thanks!! I will try this