KolmogorovLab / Wakhan

Haplotype-specific somatic copy number aberrations/profiling from long-read sequencing data
MIT License

write_segments_coverage() missing 1 required positional argument #5

Closed willhooper closed 8 months ago

willhooper commented 8 months ago

Hi, thanks for making an update to resolve my previous issue!

I tried running the updated version but ran into a different error this time. It looks like an argument was added to write_segments_coverage(), but not all the function calls were updated:

Traceback (most recent call last):
  File "/Wakhan/src/main.py", line 294, in <module>
    main()
  File "/Wakhan/src/main.py", line 226, in main
    write_segments_coverage(segments_coverage, 'coverage.csv')
TypeError: write_segments_coverage() missing 1 required positional argument: 'arguments'

tahashmi commented 8 months ago

My mistake. I only ran it in dry-run mode and didn't test everything again; I will do a full test over the weekend. I have just pushed an update for this issue. Also note that we have changed some parameters.
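
For readers hitting the same TypeError, the failure mode can be reproduced in miniature. This is only a sketch: the function body, the `out_dir` key, and the sample segment are hypothetical stand-ins, not Wakhan's actual code. It shows that a call site left at the old two-argument form fails once a required third parameter is added, and that passing `arguments` resolves it.

```python
def write_segments_coverage(segments, filename, arguments):
    # Hypothetical body: the real function writes a CSV file; this stand-in
    # only returns the path it would write to.
    return f"{arguments['out_dir']}/{filename}"


arguments = {"out_dir": "results"}  # hypothetical key, for illustration only
segments = [("chr1", 0, 50000, 12.5)]  # made-up sample row

# Old call site: still passes two arguments, so Python raises TypeError.
try:
    write_segments_coverage(segments, "coverage.csv")
    stale_call_error = None
except TypeError as exc:
    stale_call_error = str(exc)

# Updated call site: passes the new third argument.
updated_path = write_segments_coverage(segments, "coverage.csv", arguments)

print(stale_call_error)
print(updated_path)
```

Searching the code base for every call to the changed function (for example with `grep -rn "write_segments_coverage(" src/`) is a quick way to catch call sites that a dry run does not exercise.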

willhooper commented 8 months ago

Thanks for getting back so quickly. I tried again, and ran into this error:

Traceback (most recent call last):
  File "/Wakhan/src/main.py", line 295, in <module>
    main()
  File "/Wakhan/src/main.py", line 250, in main
    coverage_plots_chromosomes(csv_df_coverage, csv_df_phasesets, arguments, thread_pool)
  File "/Wakhan/src/plots.py", line 161, in coverage_plots_chromosomes
    get_snp_segments(arguments, arguments['target_bam'][0], thread_pool)
  File "/Wakhan/src/vcf_processing.py", line 530, in get_snp_segments
    output_pileups = process_bam_for_snps_freqs(arguments, thread_pool)  # TODO Updated
  File "/Wakhan/src/bam_processing.py", line 330, in process_bam_for_snps_freqs
    basefile = pathlib.Path(arguments['phased_vcf']).stem
KeyError: 'phased_vcf'

I think this is related to the parameter change you mentioned (--phased-vcf to --normal-phased-vcf)
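
A minimal sketch of why a renamed flag produces this KeyError, assuming the `arguments` dictionary is built from `argparse` with `vars()` (an assumption; the thread does not show how Wakhan builds it). `argparse` derives the dictionary key from the flag name, so after a rename from `--phased-vcf` to `--normal-phased-vcf` any code still reading `arguments['phased_vcf']` fails. The file name below is made up.

```python
import argparse
import pathlib

parser = argparse.ArgumentParser()
# After the rename, only the new flag is registered; argparse stores it
# under the key 'normal_phased_vcf'.
parser.add_argument("--normal-phased-vcf")

# Hypothetical command line, for illustration only.
arguments = vars(parser.parse_args(["--normal-phased-vcf", "normal.phased.vcf.gz"]))

# Code that still uses the old key raises KeyError.
try:
    arguments["phased_vcf"]
    old_key_error = None
except KeyError as exc:
    old_key_error = exc.args[0]

# Code updated to the new key works; .stem strips only the last suffix.
basefile = pathlib.Path(arguments["normal_phased_vcf"]).stem

print(old_key_error)
print(basefile)
```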

tahashmi commented 8 months ago

Thanks. So unfortunate! Just updated.

willhooper commented 8 months ago

Getting farther:

INFO:root:Generating coverage/copy numbers plots genome wide
Traceback (most recent call last):
  File "/Wakhan/src/main.py", line 295, in <module>
    main()
  File "/Wakhan/src/main.py", line 290, in main
    plot_snps_frequencies(arguments, csv_df_snps_mean, df_segs_hp1, df_segs_hp2, centers, integer_fractional_means)
  File "/Wakhan/src/snps_loh.py", line 21, in plot_snps_frequencies
    output_phasesets_file_path = vcf_parse_to_csv_for_snps(arguments['tumor_vcf'])
  File "/Wakhan/src/vcf_processing.py", line 445, in vcf_parse_to_csv_for_snps
    basefile = pathlib.Path(input_vcf).stem  # filename without extension
  File "/opt/conda/envs/wakhan/lib/python3.8/pathlib.py", line 1018, in __new__
    self = cls._from_parts(args, init=False)
  File "/opt/conda/envs/wakhan/lib/python3.8/pathlib.py", line 667, in _from_parts
    drv, root, parts = self._parse_args(args)
  File "/opt/conda/envs/wakhan/lib/python3.8/pathlib.py", line 651, in _parse_args
    a = os.fspath(a)
TypeError: expected str, bytes or os.PathLike object, not NoneType

tahashmi commented 8 months ago

Is --tumor-vcf passed in the input parameters?

willhooper commented 8 months ago

No, but I'm running in tumor/normal mode so my impression was that I didn't need a VCF generated from the tumor

tahashmi commented 8 months ago

It's needed for LOH detection. But I will make it optional as well.
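
One way an optional tumor VCF could be handled is to check for `None` before building a path, since `pathlib.Path(None)` is what raised the TypeError in the traceback. The sketch below is an assumption about the general approach, not the fix that was actually committed; `vcf_basename` and `maybe_run_loh` are hypothetical helper names.

```python
import pathlib
from typing import Optional


def vcf_basename(input_vcf: Optional[str]) -> Optional[str]:
    """Return the file name without its last extension, or None if no VCF was given."""
    if input_vcf is None:
        return None
    return pathlib.Path(input_vcf).stem


def maybe_run_loh(arguments: dict) -> str:
    """Skip the LOH step when 'tumor_vcf' is missing or None (hypothetical helper)."""
    base = vcf_basename(arguments.get("tumor_vcf"))
    if base is None:
        return "LOH step skipped: no tumor VCF provided"
    return f"LOH step would run on {base}"


print(maybe_run_loh({"tumor_vcf": None}))
print(maybe_run_loh({"tumor_vcf": "tumor.vcf"}))
```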

willhooper commented 8 months ago

Just to clarify, the tumor VCF would be the result of running a germline caller (e.g. Clair3) on the tumor?

tahashmi commented 8 months ago

In tumor/normal mode ClairS should output both normal/tumor VCFs.

willhooper commented 8 months ago

Sorry for the late reply -- the tumor VCF is somatic calls then?

tahashmi commented 8 months ago

Yes, somatic calls. I am going to make this parameter required=False.

tahashmi commented 8 months ago

I have updated it. If you don't need to detect LOH, the normal phased VCF alone is enough. Also, if your coverage is low, change bin_size (default=50k) to a lower value such as 10k.
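
As a rough illustration of the settings described here, the following `argparse` sketch declares an optional tumor VCF and a bin size with a 50k default. The parser name, the exact spelling `--bin-size`, and treating `--normal-phased-vcf` as required are assumptions made for the example; consult the tool's own `--help` output for the real options.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser, for illustration only.
    parser = argparse.ArgumentParser(prog="wakhan-sketch")
    parser.add_argument("--normal-phased-vcf", required=True)
    # Optional: omitted when LOH detection is not needed.
    parser.add_argument("--tumor-vcf", required=False, default=None)
    # Default of 50k as stated in the thread; flag spelling is assumed.
    parser.add_argument("--bin-size", type=int, default=50000)
    return parser


# Low-coverage run without LOH detection: no tumor VCF, smaller bins.
args = build_parser().parse_args(
    ["--normal-phased-vcf", "normal.phased.vcf.gz", "--bin-size", "10000"]
)

print(args.tumor_vcf, args.bin_size)
```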

willhooper commented 8 months ago

Great, thanks for your help! Much appreciated.

tahashmi commented 8 months ago

Hi @willhooper, I would like to see the genome-wide copy number plots for low-coverage data and how Wakhan performs on it, because I have never tested on such data. If possible, please share them or email me. Thanks!

willhooper commented 8 months ago

@tahashmi just shared the output via email!