hillerlab / make_lastz_chains

Portable solution to generate genome alignment chains using lastz
MIT License

Pipeline failed at last clean_chains step #44

Open Alchimic007 opened 9 months ago

Alchimic007 commented 9 months ago

Hello, I got an error at the last step, clean chains:

### Clean Chains Step ###

Chains were filled: using /users/hpctestuser/mprylutskyi/makechains/make_lastz_chains/deschrambler_test_files/temp_chain_run/human.cow.filled.chain.gz as input
Chain to be cleaned saved to: /users/hpctestuser/mprylutskyi/makechains/make_lastz_chains/deschrambler_test_files/temp_chain_run/human.cow.before_cleaning.chain.gz
An error occurred while executing clean_chains: 'str' object has no attribute 'removesuffix'

I understand that the script has some problem with the .gz suffix of the chain file, but I do not understand how to solve it. Here is the server log:

Chains were filled: using /users/hpctestuser/mprylutskyi/makechains/make_lastz_chains/deschrambler_test_files/temp_chain_run/human.cow.filled.chain.gz as input
Chain to be cleaned saved to: /users/hpctestuser/mprylutskyi/makechains/make_lastz_chains/deschrambler_test_files/temp_chain_run/human.cow.before_cleaning.chain.gz
An error occurred while executing clean_chains: 'str' object has no attribute 'removesuffix'
Traceback (most recent call last):
  File "/users/hpctestuser/mprylutskyi/makechains/make_lastz_chains/modules/step_manager.py", line 70, in execute_steps
    step_result = step_to_function[step](params, project_paths, step_executables)
  File "/users/hpctestuser/mprylutskyi/makechains/make_lastz_chains/modules/pipeline_steps.py", line 88, in clean_chains_step
    do_chains_clean(params, project_paths, executables)
  File "/users/hpctestuser/mprylutskyi/makechains/make_lastz_chains/steps_implementations/clean_chain_step.py", line 31, in do_chains_clean
    _output_chain = input_chain.removesuffix(".gz")
AttributeError: 'str' object has no attribute 'removesuffix'

SGE job completed on Wed Nov 29 17:47:50 GMT 2023

I would be grateful for any help

MichaelHiller commented 9 months ago

Hi,

chainCleaner actually works with unzipped or gzipped input chain files, so this is not the problem. Can you please run it separately as:

chainCleaner $input.chain.gz $reference.2bit $query.2bit $output.cleaned.chain removedSuspects.bed -linearGap=loose -tSizes=$reference.chrom.sizes -qSizes=$query.chrom.sizes -LRfoldThreshold=2.5 -doPairs -LRfoldThresholdPairs=10 -maxPairDistance=10000 -maxSuspectScore=100000 -minBrokenChainScore=75000

Alchimic007 commented 9 months ago

It seems chainCleaner doesn't see its dependency NetFilterNonNested.perl:

chainCleaner /users/hpctestuser/mprylutskyi/makechains/make_lastz_chains/human.cow.before_cleaning.chain.gz /users/hpctestuser/mprylutskyi/makechains/make_lastz_chains/deschrambler_test_files/target.2bit /users/hpctestuser/mprylutskyi/makechains/make_lastz_chains/deschrambler_test_files/query.2bit /users/hpctestuser/mprylutskyi/makechains/make_lastz_chains/human.cow.cleaned.chain removedSuspects.bed -linearGap=loose -tSizes=/users/hpctestuser/mprylutskyi/makechains/make_lastz_chains/deschrambler_test_files/target.chrom.sizes -qSizes=/users/hpctestuser/mprylutskyi/makechains/make_lastz_chains/deschrambler_test_files/query.chrom.sizes -LRfoldThreshold=2.5 -doPairs -LRfoldThresholdPairs=10 -maxPairDistance=10000 -maxSuspectScore=100000 -minBrokenChainScore=75000

Verbosity level: 1
foldThreshold: 0.000000
LRfoldThreshold: 2.500000
maxSuspectBases: 2147483647
maxSuspectScore: 100000
minBrokenChainScore: 75000
minLRGapSize: 0
doPairs with LRfoldThreshold: 10.000000 maxPairDistance 10000
which: no NetFilterNonNested.perl in (/users/hpctestuser/miniconda2/envs/py37/bin:/users/hpctestuser/edirect:/users/hpctestuser/miniconda2/condabin:/users/hpctestuser/yurchenko/PIPELINE_DCMS/DCMS2/APP/miniconda2/bin:/users/hpctestuser/yurchenko/WGS_COW/POP_STRUCTURE/FILTERING/driver/easySFS-master:/users/hpctestuser/yurchenko/APP/admixture_linux-1.3.0:/users/hpctestuser/yurchenko/APP/seqtk-1.3:/users/hpctestuser/yurchenko/APP/rapidNJ/bin:/users/hpctestuser/yurchenko/APP/standard-RAxML-8.2.12:/users/hpctestuser/yurchenko/APP/PLINK:/users/hpctestuser/yurchenko/APP/htslib19/bin:/users/hpctestuser/yurchenko/APP/bcftools19/bin:/users/hpctestuser/yurchenko/APP/samtools19/bin:/users/hpctestuser/yurchenko/APP/vcftools16/bin:/usr/lib64/qt-3.3/bin:/opt/service/gridscheduler/2011.11p1_155/bin/linux-x64:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/users/hpctestuser/bin:/users/hpctestuser/mprylutskyi/makechains/make_lastz_chains/HL_kent_binaries/NetFilterNonNested.perl:/users/hpctestuser/mprylutskyi/makechains/make_lastz_chains/HL_kent_binaries/NetFilterNonNested.perl)
ERROR: NetFilterNonNested.perl (comes with the chainCleaner source code) is not a binary in $PATH. Either install it or provide the nets as input.

MichaelHiller commented 9 months ago

OK, then the installation was incomplete, but this is easy to fix. Please git clone https://github.com/hillerlab/GenomeAlignmentTools and add all tools to your $PATH. Then which NetFilterNonNested.perl should work.
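A quick way to check the $PATH setup, sketched in Python. One thing worth noting: $PATH entries have to be directories, and in the `which` output earlier in this thread the last two entries end in .../HL_kent_binaries/NetFilterNonNested.perl, i.e. they name the script itself rather than the directory containing it, which may be why the lookup failed. The helper names below are hypothetical and not part of make_lastz_chains or GenomeAlignmentTools; the sketch assumes a POSIX system.

```python
import os
import shutil


def find_tool(name, path=None):
    """Return the full path of an executable found on PATH, or None if missing."""
    # shutil.which searches the given PATH string (or os.environ["PATH"] when None).
    return shutil.which(name, path=path)


def non_directory_entries(path=None):
    """Return PATH entries that are not existing directories.

    Such entries (for example, a path pointing at a script file) cannot
    contribute any executables to the lookup.
    """
    raw = os.environ.get("PATH", "") if path is None else path
    return [entry for entry in raw.split(os.pathsep) if entry and not os.path.isdir(entry)]


if __name__ == "__main__":
    print("NetFilterNonNested.perl found at:", find_tool("NetFilterNonNested.perl"))
    print("PATH entries that are not directories:", non_directory_entries())
```

If the second line lists an entry ending in NetFilterNonNested.perl, replacing it with its parent directory in $PATH is the likely fix.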

shuifeng1988 commented 5 months ago

I also got an error at the last clean chains step! No result file (/gpfs/home/mays/git/make_lastz_chains/test_out/temp_chain_run/target.query.filled.chain.gz) was generated.

Nextflow process fill_chain finished successfully

Merging filled chains
Executing the following sequence of commands in a pipe:
['find', '/gpfs/home/mays/git/make_lastz_chains/test_out/temp_fill_chain/filled_chain_files', '-type', 'f', '-name', '*.chain', '-print']
['/gpfs/home/mays/soft/rnacocktail_raw/ucsc_tools/chainMergeSort', '-inputList=stdin', '-tempDir=/gpfs/home/mays/git/make_lastz_chains/test_out/temp_kent']
['gzip', '-c'] .
Merging filled chains done
Fill chains step complete

Clean Chains Step

Chains were filled: using /gpfs/home/mays/git/make_lastz_chains/test_out/temp_chain_run/target.query.filled.chain.gz as input
Chain to be cleaned saved to: /gpfs/home/mays/git/make_lastz_chains/test_out/temp_chain_run/target.query.before_cleaning.chain.gz
An error occurred while executing clean_chains: 'str' object has no attribute 'removesuffix'
Traceback (most recent call last):
  File "/gpfs/home/mays/git/make_lastz_chains/modules/step_manager.py", line 70, in execute_steps
    step_result = step_to_function[step](params, project_paths, step_executables)
  File "/gpfs/home/mays/git/make_lastz_chains/modules/pipeline_steps.py", line 88, in clean_chains_step
    do_chains_clean(params, project_paths, executables)
  File "/gpfs/home/mays/git/make_lastz_chains/steps_implementations/clean_chain_step.py", line 31, in do_chains_clean
    _output_chain = input_chain.removesuffix(".gz")
AttributeError: 'str' object has no attribute 'removesuffix'

However, when I ran chainCleaner directly, following Alchimic007's command, it finished OK and the output file was generated successfully. I don't know why. The command was:

chainCleaner /gpfs/home/mays/git/make_lastz_chains/test_out/temp_chain_run/target.query.before_cleaning.chain.gz ./test_out/target.2bit ./test_out/query.2bit /gpfs/home/mays/git/make_lastz_chains/test_out/temp_chain_run/target.query.filled.chain.gz removedSuspects.bed -linearGap=loose -tSizes=./test_out/target.chrom.sizes -qSizes=./test_out/query.chrom.sizes -LRfoldThreshold=2.5 -doPairs -LRfoldThresholdPairs=10 -maxPairDistance=10000 -maxSuspectScore=100000 -minBrokenChainScore=75000

Verbosity level: 1
foldThreshold: 0.000000
LRfoldThreshold: 2.500000
maxSuspectBases: 2147483647
maxSuspectScore: 100000
minBrokenChainScore: 75000
minLRGapSize: 0
doPairs with LRfoldThreshold: 10.000000 maxPairDistance 10000
0. need to net the input chains /gpfs/home/mays/git/make_lastz_chains/test_out/temp_chain_run/target.query.before_cleaning.chain.gz (no net file given) ...
tempfile for netting: tmp.chainCleaner.XonZACT.net
Got 2 chroms in ./test_out/target.chrom.sizes, 2 in ./test_out/query.chrom.sizes
Finishing nets
writing stdout
writing /dev/null
DONE (nets in tmp.chainCleaner.XonZACT.net)
1. parsing fills/gaps from tmp.chainCleaner.XonZACT.net and getting valid breaks ...
1.1 read net file tmp.chainCleaner.XonZACT.net into memory ... DONE
1.2 get fills/gaps from tmp.chainCleaner.XonZACT.net ... DONE
1.3 get aligning regions from tmp.chainCleaner.XonZACT.net ... DONE
1.4 get valid breaks ... DONE
Remove temporary netfile tmp.chainCleaner.XonZACT.net
DONE (parsing fills/gaps and getting valid breaks)
2. reading breaking and broken chains from /gpfs/home/mays/git/make_lastz_chains/test_out/temp_chain_run/target.query.before_cleaning.chain.gz and write irrelevant chains to /gpfs/home/mays/git/make_lastz_chains/test_out/temp_chain_run/target.query.filled.chain.gz.unsorted ... DONE
3. reading target and query DNA sequences for breaking and broken chains ... DONE
4. loop over all breaks. Remove suspects if they pass our filters and write out deleted suspects to removedSuspects.bed ... DONE
5. write the (new) breaking and the broken chains to /gpfs/home/mays/git/make_lastz_chains/test_out/temp_chain_run/target.query.filled.chain.gz.unsorted ... DONE
6. chainSort /gpfs/home/mays/git/make_lastz_chains/test_out/temp_chain_run/target.query.filled.chain.gz.unsorted /gpfs/home/mays/git/make_lastz_chains/test_out/temp_chain_run/target.query.filled.chain.gz ... DONE
7. free memory ... DONE

memory usage 58077184, utime 0 s/100, stime 0

ALL DONE. New chains are in /gpfs/home/mays/git/make_lastz_chains/test_out/temp_chain_run/target.query.filled.chain.gz. Deleted suspects in removedSuspects.bed

I have solved this problem: I updated my Python to 3.12.2. Thank you! @Alchimic007
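Some background on why the upgrade helps: str.removesuffix was added in Python 3.9, so the call in clean_chain_step.py raises AttributeError on older interpreters (the PATH in the first report includes a conda environment named py37). For anyone who cannot upgrade, a backward-compatible equivalent can be sketched as below. The helper name strip_suffix is hypothetical; this is an illustration and not a patch from the make_lastz_chains authors.

```python
def strip_suffix(text, suffix):
    """Return text without a trailing suffix, like str.removesuffix in Python 3.9+.

    Hypothetical helper for illustration; not part of make_lastz_chains.
    """
    # An empty suffix must be handled separately: text[:-0] would return "".
    if suffix and text.endswith(suffix):
        return text[: -len(suffix)]
    return text


if __name__ == "__main__":
    # Same kind of operation as input_chain.removesuffix(".gz") in the traceback.
    print(strip_suffix("target.query.filled.chain.gz", ".gz"))
```

Upgrading to Python 3.9 or later, as done here, avoids the need for any such workaround.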

Wanchengshan commented 6 days ago

I encountered the same error with the Clean chains step. The steps.json file shows:

"partition": "completed",
"lastz": "completed",
"cat": "completed",
"chain_run": "completed",
"chain_merge": "completed",
"fill_chains": "completed",
"clean_chains": "failed"

The run.log file shows:

An error occurred while executing clean_chains: chain cleaner process died with the following error message:
Verbosity level: 1
foldThreshold: 0.000000
LRfoldThreshold: 2.500000
maxSuspectBases: 2147483647
maxSuspectScore: 100000
minBrokenChainScore: 75000
minLRGapSize: 0
doPairs with LRfoldThreshold: 10.000000 maxPairDistance 10000
0. need to net the input chains /media/dell/a55bf422-a680-40ff-b84e-7b85c78d4e48/home/chengbin/Documents/Repo/make_lastz_chains/test_out_3sample/BTXL/temp_chain_run/target.query.before_cleaning.chain.gz (no net file given) ...
tempfile for netting: tmp.chainCleaner.XMYUfNk.net
Got 464 chroms in /media/dell/a55bf422-a680-40ff-b84e-7b85c78d4e48/home/chengbin/Documents/Repo/make_lastz_chains/test_out_3sample/BTXL/target.chrom.sizes, 2155 in /media/dell/a55bf422-a680-40ff-b84e-7b85c78d4e48/home/chengbin/Documents/Repo/make_lastz_chains/test_out_3sample/BTXL/query.chrom.sizes
Finishing nets
writing tmp.chainCleaner.XMYUfNk.net.raw
writing /dev/null
DONE (nets in tmp.chainCleaner.XMYUfNk.net)

I am unsure how to resolve this issue and would greatly appreciate it if anybody could help me out.

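To see at a glance which step needs attention, the step statuses can also be read programmatically. This is a minimal sketch, assuming steps.json is a flat JSON object that maps step names to status strings, as the excerpt in this comment suggests. The function name and the exact file layout are assumptions and are not taken from the make_lastz_chains code.

```python
import json


def steps_with_status(steps_json_text, status="failed"):
    """Return the names of steps whose status equals `status`, in file order."""
    # Assumes a flat {"step_name": "status"} object; this layout is an assumption.
    steps = json.loads(steps_json_text)
    return [name for name, state in steps.items() if state == status]


if __name__ == "__main__":
    example = """{
        "partition": "completed", "lastz": "completed", "cat": "completed",
        "chain_run": "completed", "chain_merge": "completed",
        "fill_chains": "completed", "clean_chains": "failed"
    }"""
    print(steps_with_status(example))
```

With a real run, the text would come from reading the steps.json file in the project output directory instead of the inline example.
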
MichaelHiller commented 2 days ago

Hi Wanchengshan,

I don't see any error message.

Can you send me the input data (all input) and the exact chainCleaner command that you are running? I'll try to reproduce this error on my system.