Hi,
Could you please let us know which versions of the SigProfiler packages you have installed? I suspect there is an issue with your VCF input files. Could you also share a snippet of the input that reproduces your error? You can e-mail it to me; my contact info is in the README.
Thanks!
-------Python and Package Versions-------
Python Version: 3.9.0
SigProfilerMatrixGenerator Version: 1.2.28
SigProfilerPlotting version: 1.3.24
matplotlib version: 3.5.3
statsmodels version: 0.14.2
scipy version: 1.13.1
pandas version: 1.4.4
numpy version: 1.23.2
I have emailed you the code snippets, thanks
Please see the following information
[image: Screenshot 2024-09-11 at 00.47.06.png] [image: Screenshot 2024-09-11 at 00.47.13.png]
Yours sincerely, Angel Wong On Ki, Faculty of Science, The University of Hong Kong
Hi @Angel030331,
I contacted you for more information about the input VCF you are running, but I have not heard back yet. In the meantime, I am sharing an example VCF file. Please see the file called example.vcf on our wiki page: https://osf.io/s93d5/wiki/3.%20Using%20the%20Tool%20-%20SBS,%20ID,%20DBS%20Input/
To whom it may concern,
Sorry for the late reply; I missed the email. Please see the attached files, which are the VCF files in the input directory, along with the attached error message. I apologize again, and thank you so much for your time and effort.
HG008_T_N-P_HiFi_GRCh38-GIABv3_DeepVariant_snv.vcf https://drive.google.com/file/d/1X3EHmYovW4P79klfe2zLxc0XIcHnIw4z/view?usp=drive_web
HG008_T_N-P_Ilmn_GRCh38-GIABv3_ClairS_snv.vcf https://drive.google.com/file/d/1Og6KvaPTbwePyiTEFKpr5auzAAMwkNOf/view?usp=drive_web
HG008_T_N-P_Ilmn_GRCh38-GIABv3_Dragen_snv.vcf https://drive.google.com/file/d/1M6gDbARYJaHwAar6sKha6uWRujLR0aFd/view?usp=drive_web
HG008_T_N-P_ONT_GRCh37_ClairS_snv.vcf https://drive.google.com/file/d/1RAgufBcEyN3pZK2bHrQUXAslbF6Zrt5g/view?usp=drive_web
[image: image.png] [image: image.png]
Yours sincerely, Angel Wong On Ki, Faculty of Science, The University of Hong Kong
Hi @Angel030331,
I tested matrix generation with your input files, and the files themselves appear to be the source of the issue.
The Dragen file contains non-canonical chromosomes (i.e., chr_Un...); you will want to filter these out before your run. Additionally, the HG008_T_N-P_ONT_GRCh37_ClairS_snv.vcf input file is GRCh37, not GRCh38 like the rest of the files in this analysis. There were no issues with matrix generation for HG008_T_N-P_Ilmn_GRCh38-GIABv3_ClairS_snv.vcf.
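For anyone hitting the same error, the filtering step suggested above could look roughly like the sketch below. It is not part of SigProfilerMatrixGenerator; the names `CANONICAL`, `filter_canonical`, and `filter_vcf` are hypothetical, and it assumes that "canonical" means chromosomes 1-22, X, and Y, with or without a `chr` prefix. Adjust the pattern if your analysis also needs the mitochondrial contig.

```python
import re

# Assumption: canonical = 1-22, X, Y, with or without a "chr" prefix.
CANONICAL = re.compile(r"^(chr)?([1-9]|1[0-9]|2[0-2]|X|Y)$")


def filter_canonical(lines):
    """Yield VCF header lines plus records whose CHROM is canonical."""
    for line in lines:
        if line.startswith("#"):
            # Header and column lines are passed through unchanged.
            yield line
            continue
        chrom = line.split("\t", 1)[0]
        if CANONICAL.match(chrom):
            yield line


def filter_vcf(src_path, dst_path):
    """Write a copy of an uncompressed VCF without non-canonical records."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(filter_canonical(src))
```

Writing to a new file rather than editing in place keeps the original VCF available for comparison. The filtered copy would then go into the `input` directory in place of the original.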
Thanks @mdbarnesUCSD, thank you so much for the feedback. I will check them out.
Please reach out if you have any additional issues.
Analyze.cosmic_fit(
    samples='/autofs/bal34/okwong/cancer_stat_proj/SigProfiler/lab_data',
    output="/autofs/bal34/okwong/cancer_stat_proj/SigProfiler/lab_data/lab_data_test_GRCh37_10092024",
    input_type="vcf",
    context_type="96",
    genome_build="GRCh38",
    cosmic_version=3.4,
    make_plots=True,
    verbose=True)
The code above results in the error below:
File /autofs/bal36/zxzheng/env/conda/envs/mamba/envs/somatic/lib/python3.9/site-packages/SigProfilerMatrixGenerator/scripts/MutationMatrixGenerator.py:451, in catalogue_generator_single(lines, chrom, mutation_dict, mutation_dinuc_pd_all, mutation_types_tsb_context, vcf_path, vcf_path_original, vcf_files, bed_file_path, chrom_path, project, output_matrix, context, exome, genome, ncbi_chrom, functionFlag, bed, bed_ranges, chrom_based, plot, tsb_ref, transcript_path, tsb_stat, seqInfo, gs, log_file, volume)
    448 mut_seq += previous_mut
    450 for l in range(start1 + 1, start2, 1):
--> 451     mnv_seq += tsb_ref[chrom_string[l - 1]][1]
    452     mut_seq += tsb_ref[chrom_string[l - 1]][1]
    454 if i < len(mnv_index) - 1:
IndexError: index out of range
There is an 'input' directory in the '/autofs/bal34/okwong/cancer_stat_proj/SigProfiler/lab_data' directory that contains the VCFs.
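Since every VCF in that `input` directory is processed against a single `genome_build`, a quick pre-flight check of each file can help before calling `cosmic_fit`. The sketch below is only an illustration and not part of any SigProfiler package; `contig_fields`, `summarize`, and `check_directory` are hypothetical helpers. It assumes uncompressed VCFs whose headers carry `##contig` lines, and it guesses the build from the declared length of chromosome 1 (249250621 in GRCh37, 248956422 in GRCh38).

```python
from pathlib import Path

# Chromosome 1 lengths used to guess the reference build.
CHR1_LENGTHS = {249250621: "GRCh37", 248956422: "GRCh38"}


def contig_fields(line):
    """Parse '##contig=<ID=chr1,length=...>' into a dict of its fields."""
    body = line.strip()[len("##contig=<"):-1]
    return dict(item.split("=", 1) for item in body.split(",") if "=" in item)


def summarize(lines):
    """Return (build guess or None, set of CHROM values seen in records)."""
    build, chroms = None, set()
    for line in lines:
        if line.startswith("##contig=<"):
            fields = contig_fields(line)
            length = fields.get("length", "")
            if fields.get("ID") in ("1", "chr1") and length.isdigit():
                build = CHR1_LENGTHS.get(int(length))
        elif not line.startswith("#") and line.strip():
            chroms.add(line.split("\t", 1)[0])
    return build, chroms


def check_directory(input_dir):
    """Print the guessed build and chromosome names for each .vcf file."""
    for path in sorted(Path(input_dir).glob("*.vcf")):
        with open(path) as handle:
            build, chroms = summarize(handle)
        print(path.name, build or "unknown build", sorted(chroms))
```

A file whose guessed build differs from the others, or whose chromosome list includes unplaced or random contigs, is worth fixing or moving out of the directory before the run.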