songbowang125 / SVision-pro

GNU General Public License v3.0

AssertionError: group argument must be None for now #7

Closed QianZixi closed 1 month ago

QianZixi commented 1 month ago

Hello, thank you for developing SVision-pro, such a valuable tool. It has been very helpful to us! However, I encountered the following problem while using it. Do you know what caused it?

2024-07-14 17:28:39,967 [INFO]    ****************** Step2 lite-Unet predicting **************************
Traceback (most recent call last):
  File "/home/user/xxx/miniconda3/envs/xxx/bin/SVision-pro", line 4, in <module>
    __import__('pkg_resources').run_script('SVision-pro==1.8', 'SVision-pro')
  File "/home/user/xxx/miniconda3/envs/xxx/lib/python3.9/site-packages/pkg_resources/__init__.py", line 722, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/home/user/xxx/miniconda3/envs/xxx/lib/python3.9/site-packages/pkg_resources/__init__.py", line 1561, in run_script
    exec(code, namespace, namespace)
  File "/home/user/xxx/miniconda3/envs/xxx/lib/python3.9/site-packages/SVision_pro-1.8-py3.9.egg/EGG-INFO/scripts/SVision-pro", line 279, in <module>
    process_pool = NoDaemonPool(max(1, int(options.process_num / options.unet_cpu_num)))
  File "/home/user/xxx/miniconda3/envs/xxx/lib/python3.9/multiprocessing/pool.py", line 212, in __init__
    self._repopulate_pool()
  File "/home/user/xxx/miniconda3/envs/xxx/lib/python3.9/multiprocessing/pool.py", line 303, in _repopulate_pool
    return self._repopulate_pool_static(self._ctx, self.Process,
  File "/home/user/xxx/miniconda3/envs/xxx/lib/python3.9/multiprocessing/pool.py", line 319, in _repopulate_pool_static
    w = Process(ctx, target=worker,
  File "/home/user/xxx/miniconda3/envs/xxx/lib/python3.9/multiprocessing/process.py", line 82, in __init__
    assert group is None, 'group argument must be None for now'
AssertionError: group argument must be None for now
songbowang125 commented 1 month ago

It appears that an assertion error in the multiprocessing module of Python might be caused by version incompatibilities. Did you install SVision-pro following the provided instructions?
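
For readers hitting the same error: the traceback shows `Process(ctx, target=worker, ...)` being called from `multiprocessing/pool.py`, so the context object lands in the first positional parameter of `multiprocessing.Process`, which is `group`. The sketch below reproduces only that call shape. It assumes, without having checked the installed SVision-pro code, that `NoDaemonPool` hands the pool a plain `Process` subclass, a pattern that was commonly written for older Python versions. The helper name `build_process_like_pool` is made up for this illustration.

```python
import multiprocessing

# Stand-in for the context object that pool.py passes as the first argument.
ctx = multiprocessing.get_context()


def build_process_like_pool(process_cls):
    """Mimic the call shape from the traceback: Process(ctx, target=...)."""
    try:
        process_cls(ctx, target=print)
    except AssertionError as err:
        return str(err)
    return None


# multiprocessing.Process takes `group` as its first parameter, so the
# context object trips the `group is None` assertion in Process.__init__.
message = build_process_like_pool(multiprocessing.Process)
print(message)
```

If the message printed here matches the one in the traceback, the Python version in the environment is newer than the one the pool subclass was written for, which fits the suggestion to reinstall following the project's instructions.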

QianZixi commented 1 month ago

Thank you for your response. I reinstalled SVision-pro according to the installation instructions you provided, and this issue no longer occurs.

QianZixi commented 1 month ago

I also ran into a new issue while using SVision-pro. I get normal VCF output from the CCS BAM files of HG002, HG003, and HG004. But with simulated data generated through VISOR+lrsim+Minimap2, SVision-pro does not find any SVs at all. The same simulated BAM file yields normal VCF output with other callers such as cuteSV and Sniffles2. Here are the logs from running SVision-pro:

2024-07-17 15:01:21,473 [INFO] ****************** Step1 collecting and plotting **********************
2024-07-17 15:02:02,657 [INFO] Collecting sample1 1_0_10000000, 0 candidate events found. Time cost 40s
2024-07-17 15:02:41,693 [INFO] Collecting sample1 1_10000000_20000000, 0 candidate events found. Time cost 39s
2024-07-17 15:03:19,933 [INFO] Collecting sample1 1_20000000_30000000, 0 candidate events found. Time cost 38s
2024-07-17 15:03:58,521 [INFO] Collecting sample1 1_30000000_40000000, 0 candidate events found. Time cost 38s
2024-07-17 15:04:37,283 [INFO] Collecting sample1 1_40000000_50000000, 0 candidate events found. Time cost 38s
2024-07-17 15:05:17,895 [INFO] Collecting sample1 1_50000000_60000000, 0 candidate events found. Time cost 40s
2024-07-17 15:05:57,390 [INFO] Collecting sample1 1_60000000_70000000, 0 candidate events found. Time cost 39s
2024-07-17 15:06:37,067 [INFO] Collecting sample1 1_70000000_80000000, 0 candidate events found. Time cost 39s
2024-07-17 15:07:18,489 [INFO] Collecting sample1 1_80000000_90000000, 0 candidate events found. Time cost 41s
2024-07-17 15:07:57,150 [INFO] Collecting sample1 1_90000000_100000000, 0 candidate events found. Time cost 38s
2024-07-17 15:08:37,000 [INFO] Collecting sample1 1_100000000_110000000, 0 candidate events found. Time cost 39s
2024-07-17 15:09:17,404 [INFO] Collecting sample1 1_110000000_120000000, 0 candidate events found. Time cost 40s
2024-07-17 15:09:23,285 [INFO] Collecting sample1 1_120000000_130000000, 0 candidate events found.
Time cost 5s 2024-07-17 15:09:23,407 [INFO] Collecting sample1 1_130000000_140000000, 0 candidate events found. Time cost 0s 2024-07-17 15:09:51,127 [INFO] Collecting sample1 1_140000000_150000000, 0 candidate events found. Time cost 27s 2024-07-17 15:10:28,830 [INFO] Collecting sample1 1_150000000_160000000, 0 candidate events found. Time cost 37s 2024-07-17 15:11:09,129 [INFO] Collecting sample1 1_160000000_170000000, 0 candidate events found. Time cost 40s 2024-07-17 15:11:48,849 [INFO] Collecting sample1 1_170000000_180000000, 0 candidate events found. Time cost 39s 2024-07-17 15:12:28,300 [INFO] Collecting sample1 1_180000000_190000000, 0 candidate events found. Time cost 39s Here are three reads from the content of the BAM file I am using. Simulated_2cb4169efa93f8ea_83_25027 2048 1 10001 15 3146H30M2D17M3I9M2I25M3D60M1D30M1I13M1I3M1I3M1I2M1D3M2I3M1I2M1I16M1I10M1I6M2D4M2I2M1I4M1I5M2D17M1D7M5I10M1I2M2I15M2I8M1I22M1D3M2I2M1D17M5I18M3I2M2I2M1I12M4I19M3I9M4I6M2D8M2D12M1I15M2D6M1D4M6D7M4D35M1I9M1I31M30D2M2I10M2I4M6D3M1D5M1I18M3D10M3I4M2I1M3I2M2I17M1D17M2I4M15I5M3I18M4D1M1D4M1D5M3D5M1D1M4D16M1D4M1D1M1D9M875H * 0 0 TAACCCTTAACCTAACCGGTACCCTAACCCACCCTACCCCCTTCCCTAACAACCCTAACCTCCTACCTCGAGCCCTAACCCTAACCACCCTAACCCTAACCCTAACCCAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCCTAACCACCAGACCACTACCTGCTAGACAACTACCTCCAGACCTACCCCCTAACCCACCAACCAACCTACTGAAACACCTAACCTAAACCTAATCCTAACCTAACCATCATCTAACCACAAGCCTACCACCCTCAACCCCATCAACCCACCACCCTACCCTAACACCTAACCCTACCAACTACCCTACCCTATCCCTACCTATAACCTAACCCTAAGCCTACCTACATCCGTAAGCCCATACCATAACCTAACCCTTAACCTAACCTAACCAACCCTACCCTACCCTACCTAACCCAACCCTAACCCTAAACCCTAACCCTACCTATCCTAGACCCTAACCCTAACCTAACCCTAACCCTAACCCTCCCTAACCCTAATCACCTAACACCTAACCCTAACCCCACCCAACCCAACCCTAACCCGCAACCCAGAACCCTGACCCCTGACCCCTGACCCGACCCCGACCCCGCCCCGAACCCTGACCTATGCCTAAGCGCAGAGAGGCGCGCGCGCCGCCGCAGGCGCGTAGAGGCCGCTGCGGACGGCAGGCGGGCCGCCGCGCCGGCGCAGGCAAGGCCGCCGCGGCGACAGAGAGGCGCCCAGCCCGGGAGGCGCAGA 
&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&& NM:i:284 ms:i:392 AS:i:176 nn:i:0 tp:A:P cm:i:8 s1:i:65 s2:i:0 de:f:0.2205 SA:Z:20,62918053,-,1510S933M95I2286S,1,279;5,12627,+,3824S814M71D186S,46,220; rl:i:283 Simulated_2cb4169efa93f8ea_67_40409 256 1 10001 0 87S17M2I1M1I3M1I6M1D19M1I4M1D5M1D17M1D3M1D5M1D10M1D9M2I4M1I6M1D14M3I8M1I11M1D41M7D7M1D3M1D7M1D35M1D8M1D11M1D3M1D27M2I9M2D11M1D5M2D8M2D1M1D11M1I14M2D15M1I6M5I15M1D5M1I5M2D18M2I6M2D12M45026S * 0 0 * * NM:i:84 ms:i:457 AS:i:434 nn:i:0 tp:A:S cm:i:8 s1:i:60 de:f:0.1333 rl:i:1375 Simulated_2cb4169efa93f8ea_95_124251 2048 1 10001 11 5127H7M1D6M2I3M2I14M1D10M1D7M3I5M1D7M1D44M2I15M1I9M1D15M3I7M2D51M2I7M2D4M1I2M4I9M2I20M3D5M1I4M5I3M1I7M2I2M1I8M1D4M2I7M2I11M2I3M3I9M3I7M1I1M1I14M1I6M9I5M5I7M1I8M1I5M2I10M2D7M4I13M3I3M1D7M1D5M1I9M1D3M1I34M1D2M3D7M5D11M4D16M1D7M1D2M2D7M2D13M8D6M2I2M1D3M2I5M1I2M4D6M3D2M3I7M6D5M2I11M1D5M1I2M1D5M1D11M1D2M1D6M1D6M1D1M1D6M1I28M16745H * 0 0 
TAACCCTACCCTACCACCAACTAAACCTAACCCTACCCTAACCCAACCCAACCTACCCTACCCTAACCTAACCCTAACCCTAACCCTAACCCTAACCCTATCCCTAACCCTGAACCCTAACCCTAACTGCTAACCCTACCCTAACCCTAACCTAACCTAACCAACCCTAACCATAACCCTAACCTAACCCTAACCCTAACCGTAACCCTAACCGACTAACGCACCCGTACCGTACCCTACCCTACCTAACCCTAACCCTAAACCAACCCATAACATACACCTGAACCCTAATACAACTAGCCCAAGCTACCTACCCAACAAACCCTACCTACCACCTACCCTAACCTAACCTACCCATGTACCCGTAACCCTATACCCTACATATAACTACCCTACTACACCCTAAGCCCTAACCGCTAACTGACTAATCCTACCTAACCTGAACCTAACCGGTAACCGACCTACCCTAACCTAACCCCTAACCCAACTCCTAATCCCTAACCCTAACCCTAACCCTAACCCTACACCCTAACCCACCCAACCCTAACCCTACCCTAACACCCTACCCCAACGTAACCAATACCTCTGAACGCACGGCCCGGAGCCACAGCCACACCTCACACGCACTGACCCATGCCTCAGGCGCTCAGAGAGCCGCCGGCCGGCGCAGGGCGAGAGGGCGCCGGCGGCGCTAGGCGCAGAGAGGCGCGCCGCGCCGGCG &&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&& NM:i:212 ms:i:440 AS:i:318 nn:i:0 tp:A:P cm:i:6 s1:i:50 s2:i:0 de:f:0.1862 SA:Z:5,12653,+,5766S16827M2371D,60,4646;18,10001,+,4187S509M16D17897S,1,119;12,95155,-,17207S515M42I4829S,13,156; rl:i:593 Do you know what caused this problem? Thank you!

songbowang125 commented 1 month ago

It appears that all your simulated reads have a mapping quality below 20, and SVision-pro filters out reads with mapping quality less than 20. Try '--min_mapq 0' to call from these reads.
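
The effect of a minimum mapping quality cutoff on the three reads quoted above can be checked by hand. The sketch below uses only the first five SAM columns of those reads (the MAPQ values are 15, 0, and 11) and a simple `mapq >= min_mapq` comparison. That comparison is an assumption about how such a filter behaves in general, not SVision-pro's actual code, and `passes_mapq` is a hypothetical helper.

```python
# First five SAM columns of the three quoted reads:
# QNAME, FLAG, RNAME, POS, MAPQ (the remaining columns are omitted here).
records = [
    "Simulated_2cb4169efa93f8ea_83_25027\t2048\t1\t10001\t15",
    "Simulated_2cb4169efa93f8ea_67_40409\t256\t1\t10001\t0",
    "Simulated_2cb4169efa93f8ea_95_124251\t2048\t1\t10001\t11",
]


def passes_mapq(sam_line, min_mapq):
    """Return True if the read's MAPQ (5th SAM column) meets the cutoff."""
    mapq = int(sam_line.split("\t")[4])
    return mapq >= min_mapq


# Assumed cutoff of 20 (the value named in the reply) versus a relaxed cutoff of 0.
kept_default = [r for r in records if passes_mapq(r, 20)]
kept_relaxed = [r for r in records if passes_mapq(r, 0)]
print(len(kept_default), len(kept_relaxed))
```

With a cutoff of 20 none of the three reads survive, while a cutoff of 0 keeps all of them.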

By the way, why do your reads have such low mapping qualities and base qualities? In real sequencing data these values are typically not that low. You may need to adjust your read simulator (lrsim?).
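
To put a number on the base qualities: the quality strings in the quoted reads consist entirely of `&`. Assuming the standard Phred+33 encoding used in SAM files, that character can be decoded as sketched below. The function names are hypothetical and only serve this illustration.

```python
def phred33_scores(quality_string):
    """Decode a Phred+33 quality string into integer quality scores."""
    return [ord(ch) - 33 for ch in quality_string]


def error_probability(q):
    """Convert a Phred score into the implied per-base error probability."""
    return 10 ** (-q / 10)


# A short excerpt of the '&'-only quality strings from the quoted reads.
scores = phred33_scores("&&&&&&&&")
print(scores[0], round(error_probability(scores[0]), 3))
```

Under that encoding every base has quality 5, which corresponds to an error probability of roughly 0.32 per base.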

QianZixi commented 1 month ago

Thanks a lot. This is indeed a problem I have when using the simulator. Thank you for your help.

QianZixi commented 1 month ago

I tried setting '--min_mapq' to 0, but it still didn't solve the problem. I tried other parameters until the problem was solved by setting '--skip-umismatch_filter'. What is the function of this parameter? If this parameter is selected, what impact will it have on the results? Thanks a lot!

songbowang125 commented 1 month ago

Your simulated reads are very error-prone, so the mismatch filter discards them: there are too many Xs, Is and Ds in their CIGAR strings. Setting '--skip-mismatch_filter' turns that filter off, which is why the reads are then used. Why not try pbsim2 for read simulation? This tool simulates realistic error rates as closely as possible.
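
As a rough illustration of what "too many Xs, Is and Ds" means, the sketch below counts those operations in a CIGAR string and reports them as a fraction of the aligned columns. Both the metric and the helper name `indel_mismatch_fraction` are assumptions made for this example. SVision-pro's real filter and its threshold are not shown in this thread and may work differently. The input is the beginning of the first quoted read's CIGAR string.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def indel_mismatch_fraction(cigar):
    """Fraction of alignment columns that are I, D, or X operations."""
    aligned = 0
    noisy = 0
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "M=":
            aligned += n
        elif op in "IDX":
            aligned += n
            noisy += n
        # Clipping (S, H), skips (N) and padding (P) are ignored.
    return noisy / aligned if aligned else 0.0


# Beginning of the first quoted read's CIGAR string (truncated for brevity).
cigar_prefix = "3146H30M2D17M3I9M2I25M3D60M1D30M"
fraction = indel_mismatch_fraction(cigar_prefix)
print(round(fraction, 4))
```

Note that plain `M` operations hide mismatches, so this count understates the error rate. The `de:f` tags that minimap2 wrote for the three quoted reads (0.2205, 0.1333, 0.1862) give the aligner's own divergence estimate.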

QianZixi commented 1 month ago

Okay, I will try other simulators later. Thank you for your help.