schmeing / gapless

Gapless provides combined scaffolding, gap-closing and assembly correction with long reads
MIT License

pipeline crashed : scaffold (Mainpath purely formed out of deletions discovered) #6

Closed grpiccoli closed 1 year ago

grpiccoli commented 1 year ago

Hi Stephan, I have tried running the pipeline on the soft-masked, masked and unmasked versions of the genome, and I have also tried both compiling it from source and using the latest release, all with the same result.

The command I'm running is: gapless.sh -j 32 -i ref.fasta -t pb_hifi hifi-reads.fastq.gz

I also tried running the uncompressed reads but it didn't help either

`/nesi/nobackup/vuw03529/bin/gapless/gapless.py:245: UserWarning:

distplot is a deprecated function and will be removed in seaborn v0.14.0.

Please adapt your code to use either displot (a figure-level function with similar flexibility) or histplot (an axes-level function for histograms).

For a guide to updating your code to use the new functions, please see https://gist.github.com/mwaskom/de44147ed2974457ad6372750bbe5751

  ax = sns.distplot(values, bins=100, kde=False)
gapless.py:253: UserWarning: FixedFormatter should only be used together with FixedLocator
  ax.set(xticklabels=np.where(locs.astype(int) == locs, (10 ** locs).astype(str), ""))
gapless.py:3458: PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling frame.insert many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use newframe = frame.copy()
  loops['truncate'] = True
0:00:07.623672 Reading in original assembly
0:00:09.723961 Loading repeats
0:00:43.890653 Filtering mappings
0:02:06.050920 Search for possible break points
1:44:32.297691 Search for possible bridges
1:46:59.480573 Scaffold the contigs
Start 80668
Iteration 1
0:00:07.748962 49514
0:02:42.736140 34790
Iteration 2
0:00:01.830496 33713
0:01:15.638771 31853
Iteration 3
0:00:01.322932 31785
0:00:50.008133 31584
Iteration 4
0:00:01.083436 31577
0:00:45.240387 31533
Iteration 5
0:00:00.972947 31531
0:00:42.958803 31517
Iteration 6
0:00:00.853825 31517
0:00:41.671365 31515
Iteration 7
0:00:00.871425 31515
0:00:41.411202 31514
Iteration 8
0:00:00.871560 31514
0:00:41.205733 31514
RemoveDuplicates 31159
Iteration 1
0:00:00.871014 31159
0:00:46.830285 30962
Iteration 2
0:00:01.137579 30946
0:00:44.972976 30901
Iteration 3
0:00:00.907155 30900
0:00:40.847423 30891
Iteration 4
0:00:00.902250 30890
0:00:41.089375 30888
Iteration 5
0:00:00.854656 30888
0:00:40.724487 30888
PlaceUnambigouslyPlaceables 30339
Iteration 1
0:00:00.857167 30339
0:00:58.587446 29966
Iteration 2
0:00:01.064796 29943
0:00:50.846228 29890
Iteration 3
0:00:00.898226 29888
0:00:46.759266 29879
Iteration 4
0:00:00.828450 29879
0:00:46.684630 29879
CombineOnMatchingExtensions 27509
TrimAmbiguousOverlap 18433
TrimCircularPaths 18420
pid pos phase0 scaf0 ... dist1 phase_dir deletion new_phase
21619 61694 0 0 -1 ... 0 1 True 0
21620 61694 1 0 -1 ... 3192 -1 True 0
21621 61694 2 27467 43536 ... 0 1 False 27467
21622 61694 3 27469 43537 ... 774 1 False 27469
21623 61694 4 27471 43538 ... 356 1 False 27471
21624 61694 5 27473 43539 ... 0 1 False 27473
21625 61694 6 27473 43540 ... 0 -1 False 27473
26991 81037 0 0 -1 ... 0 1 True 0
26992 81037 1 0 -1 ... -280 -1 True 0
26993 81037 2 36049 3043 ... 0 1 False 36049
26994 81037 3 36049 3045 ... 0 -1 False 36049
26995 81037 4 36051 3046 ... 0 1 False 36051
26996 81037 5 36051 3047 ... 392 -1 False 36051
41884 94590 0 0 -1 ... 0 1 True 0
41885 94590 1 0 -1 ... 0 1 True 0
41886 94590 2 0 -1 ... 0 1 True 0
41887 94590 3 0 -1 ... 0 -1 True 0
41888 94590 4 61991 35545 ... 0 1 False 61991
41889 94590 5 61991 35546 ... 0 -1 False 61991
41890 94590 6 61993 35547 ... 257 -1 False 61993

[20 rows x 13 columns]
Traceback (most recent call last):
  File "gapless.py", line 13327, in <module>
    main(sys.argv[1:])
  File "gapless.py", line 13156, in main
    GaplessScaffold(args[0], args[1], args[2], min_mapq, min_mapping_length, min_length_contig_break, prefix, stats)
  File "gapless.py", line 9101, in GaplessScaffold
    scaffold_paths, trim_repeats = ScaffoldContigs(contig_parts, bridges, mappings, cov_probs, repeats, prob_factor, min_mapping_length, max_dist_contig_end, prematurity_threshold, ploidy, max_loop_units)
  File "gapless.py", line 7845, in ScaffoldContigs
    scaffold_paths = PhaseScaffolds(scaffold_paths, graph_ext, scaf_bridges, ploidy)
  File "gapless.py", line 7671, in PhaseScaffolds
    scaffold_paths = PhaseScaffoldsWithScafBridges(scaffold_paths, scaf_bridges, ploidy)
  File "gapless.py", line 7520, in PhaseScaffoldsWithScafBridges
    scaffold_paths = AssignDeletionsToNeighbouringPhase(scaffold_paths, ploidy)
  File "gapless.py", line 7475, in AssignDeletionsToNeighbouringPhase
    raise RuntimeError("Mainpath purely formed out of deletions discovered.")
RuntimeError: Mainpath purely formed out of deletions discovered.`

Thank you for your help

schmeing commented 1 year ago

Hi grpiccoli,

Thank you for discovering and posting this issue. Since the paths can have multiple haplotypes, they can have deletions in some of the haplotypes, but somehow a path ended up being formed only out of deletions, meaning there is no path in the first place.

Is there any chance you could share the gapless_split.fa, gapless_reads.paf and gapless_split_repeats.paf files with me at stephan.schmeing@uzh.ch? That would speed up tracing the error. I need to trace the three affected paths (61694, 81037 and 94590) through the AssignDeletionsToNeighbouringPhase function, and possibly through the PhaseScaffoldsWithScafBridges function in general, to understand what is going wrong and where, so that I can fix it.

Alternatively, you could print the dataframe filtered on pid for the path numbers above at various stages of the two functions and I can go through that to figure out the problem.
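A minimal sketch of what such a filtered printout could look like. The helper name show_paths and the toy data are made up for illustration and are not part of gapless.py; only the 'pid' column name and the path numbers come from the output above.

```python
import numpy as np
import pandas as pd

def show_paths(scaffold_paths, pids, label):
    """Print all rows whose 'pid' is in pids, without truncating columns."""
    subset = scaffold_paths[np.isin(scaffold_paths['pid'], pids)]
    # option_context restores the display settings when the block ends
    with pd.option_context('display.max_columns', 500, 'display.width', 1000):
        print(f"--- {label} ---")
        print(subset)
    return subset

# Toy frame standing in for scaffold_paths (values are invented)
toy = pd.DataFrame({
    'pid': [61694, 61694, 81037, 5],
    'pos': [0, 1, 0, 0],
    'deletion': [True, False, True, False],
})
subset = show_paths(toy, [61694, 81037, 94590], "before AssignDeletionsToNeighbouringPhase")
```

Calling the helper with a different label at each stage makes it easier to tell the printed tables apart in the log.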

Thanks, Stephan

alexvasilikop commented 1 year ago

Hello,

I am facing the same issue. Is there any progress concerning the reason for this error? Thanks Alex

/home/lege/anaconda3/envs/gapless/bin/gapless.py:245: UserWarning:

distplot is a deprecated function and will be removed in seaborn v0.14.0.

Please adapt your code to use either displot (a figure-level function with similar flexibility) or histplot (an axes-level function for histograms).

For a guide to updating your code to use the new functions, please see https://gist.github.com/mwaskom/de44147ed2974457ad6372750bbe5751

  ax = sns.distplot(values, bins=100, kde=False)
/home/lege/anaconda3/envs/gapless/bin/gapless.py:253: UserWarning: FixedFormatter should only be used together with FixedLocator
  ax.set(xticklabels=np.where(locs.astype(int) == locs, (10 ** locs).astype(str), ""))
/home/lege/anaconda3/envs/gapless/bin/gapless.py:3458: PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling frame.insert many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use newframe = frame.copy()
  loops['truncate'] = True
/home/lege/anaconda3/envs/gapless/bin/gapless.py:4182: DeprecationWarning: In a future version, df.iloc[:, i] = newvals will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either df[df.columns[i]] = newvals or, if columns are non-unique, df.isetitem(i, newvals)
  finished_paths.loc[finished_paths['len'] == l, f'bi{l}'] = finished_paths.loc[finished_paths['len'] == l, 'bito']
/home/lege/anaconda3/envs/gapless/bin/gapless.py:4262: DeprecationWarning: In a future version, df.iloc[:, i] = newvals will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either df[df.columns[i]] = newvals or, if columns are non-unique, df.isetitem(i, newvals)
  indirect_conns.loc[used_units.loc[used_units['pos'] == u, 'cindex'].values, f'li{u}'] = used_units.loc[used_units['pos'] == u, 'lindex'].values
/home/lege/anaconda3/envs/gapless/bin/gapless.py:6226: DeprecationWarning: In a future version, df.iloc[:, i] = newvals will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either df[df.columns[i]] = newvals or, if columns are non-unique, df.isetitem(i, newvals)
  new_haps.loc[cur_switch, 'new_bhap'] = np.where(new_haps.loc[cur_switch, 'new_bhap'] == new_haps.loc[cur_switch, 'to_hap'], new_haps.loc[cur_switch, 'from_hap'], new_haps.loc[cur_switch, 'to_hap'])
/home/lege/anaconda3/envs/gapless/bin/gapless.py:6226: DeprecationWarning: In a future version, df.iloc[:, i] = newvals will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either df[df.columns[i]] = newvals or, if columns are non-unique, df.isetitem(i, newvals)
  new_haps.loc[cur_switch, 'new_bhap'] = np.where(new_haps.loc[cur_switch, 'new_bhap'] == new_haps.loc[cur_switch, 'to_hap'], new_haps.loc[cur_switch, 'from_hap'], new_haps.loc[cur_switch, 'to_hap'])
/home/lege/anaconda3/envs/gapless/bin/gapless.py:6226: DeprecationWarning: In a future version, df.iloc[:, i] = newvals will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either df[df.columns[i]] = newvals or, if columns are non-unique, df.isetitem(i, newvals)
  new_haps.loc[cur_switch, 'new_bhap'] = np.where(new_haps.loc[cur_switch, 'new_bhap'] == new_haps.loc[cur_switch, 'to_hap'], new_haps.loc[cur_switch, 'from_hap'], new_haps.loc[cur_switch, 'to_hap'])
/home/lege/anaconda3/envs/gapless/bin/gapless.py:6226: DeprecationWarning: In a future version, df.iloc[:, i] = newvals will attempt to set the values inplace instead of always setting a new array. To retain the old behavior, use either df[df.columns[i]] = newvals or, if columns are non-unique, df.isetitem(i, newvals)
  new_haps.loc[cur_switch, 'new_bhap'] = np.where(new_haps.loc[cur_switch, 'new_bhap'] == new_haps.loc[cur_switch, 'to_hap'], new_haps.loc[cur_switch, 'from_hap'], new_haps.loc[cur_switch, 'to_hap'])
0:00:04.490900 Reading in original assembly
0:00:05.030241 Loading repeats
0:00:05.260642 Filtering mappings
0:00:27.666875 Search for possible break points
0:05:31.693244 Search for possible bridges
0:05:44.637506 Scaffold the contigs
Start 5707
Iteration 1
0:00:03.074880 3189
0:01:33.227093 2112
Iteration 2
0:00:01.966563 1983
0:00:45.672598 1739
Iteration 3
0:00:01.614985 1712
0:00:33.802445 1654
Iteration 4
0:00:01.245741 1653
0:00:24.199406 1645
Iteration 5
0:00:01.027495 1645
0:00:23.999717 1643
Iteration 6
0:00:01.030966 1643
0:00:23.940900 1643
RemoveDuplicates 1584
Iteration 1
0:00:01.027211 1584
0:00:26.043970 1554
Iteration 2
0:00:01.465804 1547
0:00:23.938897 1542
Iteration 3
0:00:01.304622 1541
0:00:23.426922 1539
Iteration 4
0:00:01.115702 1539
0:00:23.535516 1539
PlaceUnambigouslyPlaceables 1501
Iteration 1
0:00:01.107717 1501
0:00:30.489399 1463
Iteration 2
0:00:01.244862 1461
0:00:25.670216 1456
Iteration 3
0:00:01.081861 1456
0:00:24.726531 1456
CombineOnMatchingExtensions 1267
TrimAmbiguousOverlap 896
TrimCircularPaths 895
pid pos phase0 scaf0 ... dist1 phase_dir deletion new_phase
3670 6548 0 0 -1 ... 0 1 True 0
3671 6548 1 0 -1 ... 395 1 True 0
3672 6548 2 0 -1 ... 16542 1 True 0
3673 6548 3 0 -1 ... 3262 1 True 0
3674 6548 4 0 -1 ... 8704 1 True 0
3675 6548 5 0 -1 ... 335 1 True 0
3676 6548 6 0 -1 ... 0 -1 True 0
3677 6548 7 4119 704 ... 0 1 False 4119
3678 6548 8 4121 706 ... 1533 1 False 4121
3679 6548 9 4123 708 ... 0 1 False 4123
3680 6548 10 4125 4 ... 0 1 False 4125
3681 6548 11 4125 4056 ... 0 1 False 4125
3682 6548 12 4125 711 ... 0 -1 False 4125
3683 6548 13 4131 714 ... 0 1 False 4131
3684 6548 14 4131 718 ... 0 -1 False 4131

[15 rows x 13 columns]
Traceback (most recent call last):
  File "/home/lege/anaconda3/envs/gapless/bin/gapless.py", line 13327, in <module>
    main(sys.argv[1:])
  File "/home/lege/anaconda3/envs/gapless/bin/gapless.py", line 13156, in main
    GaplessScaffold(args[0], args[1], args[2], min_mapq, min_mapping_length, min_length_contig_break, prefix, stats)
  File "/home/lege/anaconda3/envs/gapless/bin/gapless.py", line 9101, in GaplessScaffold
    scaffold_paths, trim_repeats = ScaffoldContigs(contig_parts, bridges, mappings, cov_probs, repeats, prob_factor, min_mapping_length, max_dist_contig_end, prematurity_threshold, ploidy, max_loop_units)
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lege/anaconda3/envs/gapless/bin/gapless.py", line 7845, in ScaffoldContigs
    scaffold_paths = PhaseScaffolds(scaffold_paths, graph_ext, scaf_bridges, ploidy)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lege/anaconda3/envs/gapless/bin/gapless.py", line 7671, in PhaseScaffolds
    scaffold_paths = PhaseScaffoldsWithScafBridges(scaffold_paths, scaf_bridges, ploidy)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lege/anaconda3/envs/gapless/bin/gapless.py", line 7520, in PhaseScaffoldsWithScafBridges
    scaffold_paths = AssignDeletionsToNeighbouringPhase(scaffold_paths, ploidy)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lege/anaconda3/envs/gapless/bin/gapless.py", line 7475, in AssignDeletionsToNeighbouringPhase
    raise RuntimeError("Mainpath purely formed out of deletions discovered.")
RuntimeError: Mainpath purely formed out of deletions discovered.

schmeing commented 1 year ago

Unfortunately not. To fix this I need either the input files for gapless scaffold (gapless_split.fa, gapless_reads.paf and gapless_split_repeats.paf) or some intermediate output, for which you would need to modify gapless.py and send me the new output.

To start, you need to add code before and after line 7473, which reads: scaffold_paths = AssignNewPhases(scaffold_paths, test_bridges, ploidy). This should become:

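# Show all columns instead of eliding them with "..."
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)
# Path 6548 before the new phases are assigned
print(scaffold_paths[np.isin(scaffold_paths['pid'], [6548])])
scaffold_paths = AssignNewPhases(scaffold_paths, test_bridges, ploidy)
# Path 6548 after the new phases are assigned
print(scaffold_paths[np.isin(scaffold_paths['pid'], [6548])])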

Then you should see three tables where you now see one:

pid pos phase0 scaf0 ... dist1 phase_dir deletion new_phase
3670 6548 0 0 -1 ... 0 1 True 0
3671 6548 1 0 -1 ... 395 1 True 0
3672 6548 2 0 -1 ... 16542 1 True 0
3673 6548 3 0 -1 ... 3262 1 True 0
3674 6548 4 0 -1 ... 8704 1 True 0
3675 6548 5 0 -1 ... 335 1 True 0
3676 6548 6 0 -1 ... 0 -1 True 0
3677 6548 7 4119 704 ... 0 1 False 4119
3678 6548 8 4121 706 ... 1533 1 False 4121
3679 6548 9 4123 708 ... 0 1 False 4123
3680 6548 10 4125 4 ... 0 1 False 4125
3681 6548 11 4125 4056 ... 0 1 False 4125
3682 6548 12 4125 711 ... 0 -1 False 4125
3683 6548 13 4131 714 ... 0 1 False 4131
3684 6548 14 4131 718 ... 0 -1 False 4131

And instead of the three dots you should see all columns.
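To illustrate why the two display options matter, here is a small self-contained sketch on a made-up 13-column frame (the column names c0 to c12 are placeholders, not the real gapless columns). With a low display.max_columns pandas elides the middle columns with three dots; with the generous settings it prints every column.

```python
import numpy as np
import pandas as pd

# Toy frame with 13 columns, the same width as the table in the log
cols = [f"c{i}" for i in range(13)]
wide = pd.DataFrame(np.arange(26).reshape(2, 13), columns=cols)

# Restrictive setting: the middle columns are replaced by "..."
with pd.option_context('display.max_columns', 4, 'display.width', 80):
    truncated = repr(wide)

# Generous settings: every column is printed on one line
with pd.option_context('display.max_columns', 500, 'display.width', 1000):
    full = repr(wide)

print(truncated)
print(full)
```

pd.option_context only changes the settings inside the with block, whereas pd.set_option changes them for the rest of the run.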

To check which gapless.py you have to edit (in case you have several), you can run this in the shell: type gapless.py
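If checking from Python is more convenient, a rough equivalent is shutil.which from the standard library. Note that it only searches the PATH, while the shell's type also reports aliases and functions. The helper name find_script is made up for this sketch, and the temporary directory is only there so the example does not depend on the local setup.

```python
import os
import shutil
import stat
import tempfile

def find_script(name, search_path=None):
    """Return the first executable called `name` on the search path, or None."""
    return shutil.which(name, path=search_path)

# Demonstration in a throwaway directory holding a dummy executable
with tempfile.TemporaryDirectory() as tmp:
    fake = os.path.join(tmp, "gapless.py")
    with open(fake, "w") as handle:
        handle.write("#!/usr/bin/env python3\n")
    os.chmod(fake, os.stat(fake).st_mode | stat.S_IXUSR)
    found = find_script("gapless.py", search_path=tmp)
    print(found)
```

Calling find_script("gapless.py") without search_path looks through the PATH of the current environment.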

In case you do not see the three tables, you can verify that you are executing the modified file by adding a print statement right after the definition of the main scaffolding function:

def GaplessScaffold(assembly_file, mapping_file, repeat_file, min_mapq, min_mapping_length, min_length_contig_break, prefix=False, stats=None):
    print("I am running my modified version")

Then the statement should show up at the top of your log file.
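If several copies of gapless.py are around, the marker could also say which file is running. A minimal sketch, assuming a hypothetical helper called version_banner that is not part of gapless.py:

```python
import os
import time

def version_banner(script_path):
    """Return a one-line marker naming the file and when it was last modified."""
    path = os.path.abspath(script_path)
    if os.path.exists(path):
        stamp = time.strftime("%Y-%m-%d %H:%M:%S",
                              time.localtime(os.path.getmtime(path)))
    else:
        # Path does not point to a file on disk
        stamp = "unknown"
    return f"I am running my modified version: {path} (last modified {stamp})"
```

Inside gapless.py this would be used as print(version_banner(__file__), flush=True), so the path and timestamp appear at the top of the log.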

Thank you.

grpiccoli commented 1 year ago

The issue still persists.

I am running my modified version
0:00:18.091160 Reading in original assembly
0:00:19.377244 Loading repeats
0:00:44.690528 Filtering mappings
0:01:35.033726 Search for possible break points
1:13:52.254578 Search for possible bridges
1:15:42.410280 Scaffold the contigs
Start 85048
Iteration 1
0:00:04.036511 52792
0:02:21.335328 37418
Iteration 2
0:00:01.155578 36262
0:00:56.313257 34363
Iteration 3
0:00:00.912385 34292
0:00:38.280533 34051
Iteration 4
0:00:00.670088 34046
Iteration 5
0:00:00.682901 34012
0:00:32.734666 34003
Iteration 6
0:00:00.615418 34003
0:00:31.889040 34003
RemoveDuplicates 33605
Iteration 1
0:00:00.575563 33605
0:00:36.633607 33364
Iteration 2
0:00:00.746232 33348
0:00:33.453157 33298
Iteration 3
0:00:00.672599 33296
0:00:31.781668 33285
Iteration 4
0:00:00.704894 33284
0:00:31.724536 33282
Iteration 5
0:00:00.654464 33282
0:00:31.682262 33282
PlaceUnambigouslyPlaceables 32710
Iteration 1
0:00:00.574932 32710
0:00:41.635875 32321
Iteration 2
0:00:00.710325 32298
0:00:35.542819 32245
Iteration 3
0:00:00.574831 32245
0:00:33.248869 32241
Iteration 4
0:00:00.572843 32241
0:00:33.350225 32240
Iteration 5
0:00:00.572794 32240
0:00:33.215731 32240
CombineOnMatchingExtensions 29303
TrimAmbiguousOverlap 19190
TrimCircularPaths 19177
Empty DataFrame
Columns: [pid, pos, phase0, scaf0, strand0, dist0, phase1, scaf1, strand1, dist1]
Index: []
Empty DataFrame
Columns: [pid, pos, phase0, scaf0, strand0, dist0, phase1, scaf1, strand1, dist1]
Index: []
pid pos phase0 scaf0 strand0 dist0 phase1 scaf1 strand1 dist1 phase_dir deletion new_phase
2086 6111 0 0 -1 0 2280 12108 + 0 1 True 0
2087 6111 1 0 -1 0 2280 12109 + 0 -1 True 0
2088 6111 2 2283 12110 + 0 -2284 -1 0 1 False 2283
2089 6111 3 2283 12111 + 0 -2284 -1 0 1 False 2283
2090 6111 4 2283 12112 + 0 -2284 -1 0 1 False 2283
2091 6111 5 2283 12113 + 0 -2284 -1 0 -1 False 2283
2092 6111 6 2287 -1 0 2286 12113 + 0 1 True 2287
2093 6111 7 2287 12114 + 0 -2288 -1 0 -1 False 2287
42431 99141 0 0 -1 0 62078 56142 + 0 1 True 0
42432 99141 1 0 -1 0 62078 56143 + 0 1 True 0
42433 99141 2 0 -1 0 62078 56144 + 0 -1 True 0
42434 99141 3 62083 56145 + 0 -62084 -1 0 1 False 62083
42435 99141 4 62083 56146 + 0 -62084 -1 0 -1 False 62083
42436 99141 5 62085 56147 + 0 62086 -1 0 1 False 62085
42437 99141 6 62085 56148 + 0 62088 56148 + -41 -1 False 62085
42438 99141 7 62089 56149 + 0 62090 56149 + 1711 -1 False 62089
42777 99372 0 0 -1 0 62644 20143 + 0 1 True 0
42778 99372 1 0 -1 0 62644 20144 + 0 -1 True 0
42779 99372 2 62647 20145 + 0 -62648 -1 0 1 False 62647
42780 99372 3 62649 20146 + 510 62650 -1 0 1 False 62649
42781 99372 4 62649 20147 + 0 62652 20147 + 1347 -1 False 62649
51271 116236 0 0 -1 0 69468 31617 + 0 1 True 0
51272 116236 1 0 -1 0 69468 31618 + 194 -1 True 0
51273 116236 2 69471 31619 + 0 -69472 -1 0 1 False 69471
51274 116236 3 69475 -1 0 69474 31620 + 0 1 True 69475
51275 116236 4 69475 31621 + 30 69474 31621 + 0 -1 False 69475
51276 116236 5 69477 31622 + 0 -69478 -1 0 1 False 69477
51277 116236 6 69477 31623 + 0 -69478 -1 0 1 False 69477
51278 116236 7 69477 31625 + 0 -69478 -1 0 1 False 69477
51279 116236 8 69477 31626 + 0 -69478 -1 0 1 False 69477
51280 116236 9 69477 31627 + 0 -69478 -1 0 1 False 69477
51281 116236 10 69477 31628 + 0 -69478 -1 0 1 False 69477
51282 116236 11 69477 31629 + 0 -69478 -1 0 1 False 69477
51283 116236 12 69477 31630 + 0 -69478 -1 0 1 False 69477
51284 116236 13 69477 31631 + 0 -69478 -1 0 1 False 69477
51285 116236 14 69477 31632 + 0 -69478 -1 0 1 False 69477
51286 116236 15 69477 31633 + 0 -69478 -1 0 1 False 69477
51287 116236 16 69477 31634 + 0 -69478 -1 0 -1 False 69477
Traceback (most recent call last):
  File "/nesi/nobackup/vuw03529/bin/gapless/gapless.py", line 13373, in <module>
    main(sys.argv[1:])
  File "/nesi/nobackup/vuw03529/bin/gapless/gapless.py", line 13200, in main
    GaplessScaffold(args[0], args[1], args[2], min_mapq, min_mapping_length, min_length_contig_break, True, large_contigs, prefix, stats)
  File "/nesi/nobackup/vuw03529/bin/gapless/gapless.py", line 9132, in GaplessScaffold
    scaffold_paths, trim_repeats = ScaffoldContigs(contig_parts, bridges, mappings, cov_probs, repeats, prob_factor, min_mapping_length, max_dist_contig_end, prematurity_threshold, ploidy, max_loop_units)
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nesi/nobackup/vuw03529/bin/gapless/gapless.py", line 7875, in ScaffoldContigs
    scaffold_paths = PhaseScaffolds(scaffold_paths, graph_ext, scaf_bridges, ploidy)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nesi/nobackup/vuw03529/bin/gapless/gapless.py", line 7701, in PhaseScaffolds
    scaffold_paths = PhaseScaffoldsWithScafBridges(scaffold_paths, scaf_bridges, ploidy)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nesi/nobackup/vuw03529/bin/gapless/gapless.py", line 7550, in PhaseScaffoldsWithScafBridges
    scaffold_paths = AssignDeletionsToNeighbouringPhase(scaffold_paths, ploidy)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/nesi/nobackup/vuw03529/bin/gapless/gapless.py", line 7496, in AssignDeletionsToNeighbouringPhase
    raise RuntimeError("Mainpath purely formed out of deletions discovered.")
RuntimeError: Mainpath purely formed out of deletions discovered.