susheelbhanu closed this issue 4 years ago
Hi Susheel, I will look into this issue. Meanwhile, can you please remove "plot_iden" from your command and try again? Cheers,
Thank you. I commented out the plot_iden part of the script. I will let you know if that helps!
I got this:
Traceback (most recent call last):
  File "/home/users/sbusi/apps/miniconda3/bin/MetaCHIP", line 244, in <module>
    BM(args, config_dict)
  File "/home/users/sbusi/apps/miniconda3/lib/python3.7/site-packages/MetaCHIP/BP.py", line 1971, in BM
    plot_identity = args['plot_iden']
KeyError: 'plot_iden'
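The KeyError above is what a plain dictionary lookup raises when an option is no longer present in the argument dictionary but the code still reads it. A minimal sketch of a more tolerant lookup, assuming `args` is an ordinary dict of parsed options (the helper name `get_flag` and the example dict are hypothetical, not part of MetaCHIP):

```python
def get_flag(args, name, default=False):
    """Return args[name] if present, otherwise a default instead of raising KeyError."""
    return args.get(name, default)


# Hypothetical parsed options with the 'plot_iden' entry removed.
args = {'p': 'tx-2', 'r': 'g', 't': 24, 'force': True}

plot_identity = get_flag(args, 'plot_iden')
print(plot_identity)
```

With a lookup like this, dropping the option from the command line would simply disable the plot rather than abort the run.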
@songweizhi
I commented out the same from the BP.py script, but it looks like it's linked to other issues:
[sbusi@iris-187 hgt]$ MetaCHIP BP -p tx-2 -r g -t 24 -force
[2020-06-04 09:24:02] Found grouping file tx-2_g26_grouping.txt, input genomes were clustered into 26 groups
[2020-06-04 09:24:02] Filtered blastn results at specified taxonomic rank detected from folder tx-2_g26_blastn_results_filtered. HGT analysis will be performed based on these files.
[2020-06-04 09:24:02] Combining filtered blastn results
[2020-06-04 09:24:02] Get group-to-group identities with 24 cores
[2020-06-04 09:24:02] Plotting identity distribution between each pair of groups
[2020-06-04 09:24:02] Analyzing Blast hits to get HGT candidates with 24 cores
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/opt/apps/resif/data/production/v1.1-20180716/default/software/lang/Python/3.6.4-intel-2018a/lib/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/opt/apps/resif/data/production/v1.1-20180716/default/software/lang/Python/3.6.4-intel-2018a/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/home/users/sbusi/.local/lib/python3.6/site-packages/MetaCHIP/BP.py", line 1256, in get_HGT_worker
    group_pair_iden_cutoff_dict)
  File "/home/users/sbusi/.local/lib/python3.6/site-packages/MetaCHIP/BP.py", line 387, in get_candidates
    query_gene_name = query_split[1]
IndexError: list index out of range
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/users/sbusi/.local/bin/MetaCHIP", line 227, in <module>
    BM(args, config_dict)
  File "/home/users/sbusi/.local/lib/python3.6/site-packages/MetaCHIP/BP.py", line 2253, in BM
    pool.map(get_HGT_worker, list_for_multiple_arguments_get_HGT)
  File "/opt/apps/resif/data/production/v1.1-20180716/default/software/lang/Python/3.6.4-intel-2018a/lib/python3.6/multiprocessing/pool.py", line 266, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/opt/apps/resif/data/production/v1.1-20180716/default/software/lang/Python/3.6.4-intel-2018a/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
IndexError: list index out of range
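The IndexError in get_candidates means that splitting some identifier produced fewer fields than the code expected. How BP.py builds query_split is not shown in this thread, so the following is only a sketch, assuming identifiers are split on a fixed separator (the separator '|', the example identifiers, and the function name are all hypothetical):

```python
def split_identifier(identifier, separator='|', expected_fields=2):
    """Split an identifier and fail with a readable message if fields are missing."""
    fields = identifier.split(separator)
    if len(fields) < expected_fields:
        raise ValueError(
            'expected at least %d fields separated by %r, got %r'
            % (expected_fields, separator, identifier))
    return fields


# A well-formed (hypothetical) identifier splits into two fields.
print(split_identifier('G1|gene_00001')[1])

# A malformed one is reported by name instead of raising a bare IndexError.
try:
    split_identifier('gene_00001')
except ValueError as error:
    print(error)
```

Reporting the offending identifier makes it much easier to see which input sequence or file name breaks the expected naming scheme.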
It works fine on my side. Which version do you have installed? Can you please try again with the latest version (1.9.0)?
I'm using 1.9.0 as well. Please see below:
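The two tracebacks above come from different installations (one under miniconda3 with Python 3.7, one under ~/.local with Python 3.6), so it can help to confirm which copy a given interpreter actually imports. A minimal sketch using only the standard library (Python 3.8+), assuming both the import name and the distribution name are `MetaCHIP`; the function name `locate_package` is hypothetical:

```python
import importlib.util
from importlib import metadata  # available in Python 3.8+


def locate_package(name):
    """Report where a package would be imported from and its installed version."""
    spec = importlib.util.find_spec(name)
    origin = spec.origin if spec is not None else None
    try:
        version = metadata.version(name)
    except metadata.PackageNotFoundError:
        version = None
    return {'origin': origin, 'version': version}


# Run this with the same interpreter that runs the MetaCHIP script.
print(locate_package('MetaCHIP'))
```

If the reported path does not match the environment you expect, the command on your PATH and the library it imports may belong to different installations.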
~/apps/metachip/MetaCHIP
...::: MetaCHIP v1.9.0 :::...
Core modules:
PI -> Prepare input files
BP -> Run Best-match and Phylogenetic approaches
Supplementary modules:
CMLP -> Combine multi-level predictions (part of BP module)
filter_HGT -> Get HGTs predicted at least n levels (for multi-level prediction)
update_hmms -> update hmm profiles used for inferring SCG tree
get_SCG_tree -> Get SCG protein tree
SankeyTaxon -> Visualize taxonomic classification with Sankey plot
circos_HGT -> Visualize gene flow with circos plot
rename_seqs -> Rename sequences in a file
# for command specific help
MetaCHIP PI -h
MetaCHIP BP -h
~/apps/metachip/MetaCHIP BP -p tx-2 -r g -t 24 -force
[2020-06-04 11:15:04] Found grouping file tx-2_g26_grouping.txt, input genomes were clustered into 26 groups
[2020-06-04 11:15:04] Filtered blastn results at specified taxonomic rank detected from folder tx-2_g26_blastn_results_filtered. HGT analysis will be performed based on these files.
[2020-06-04 11:15:04] Combining filtered blastn results
[2020-06-04 11:15:04] Get group-to-group identities with 24 cores
[2020-06-04 11:15:04] Plotting identity distribution between each pair of groups
Traceback (most recent call last):
  File "/home/users/sbusi/apps/metachip/MetaCHIP", line 244, in <module>
    BM(args, config_dict)
  File "/home/users/sbusi/apps/miniconda3/envs/hgtector/lib/python3.8/site-packages/MetaCHIP/BP.py", line 2220, in BM
    do(plot_identity)
  File "/home/users/sbusi/apps/miniconda3/envs/hgtector/lib/python3.8/site-packages/MetaCHIP/BP.py", line 1934, in do
    current_group_pair_identity_cut_off = np.percentile(current_group_pair_identities_array, identity_percentile)
  File "<__array_function__ internals>", line 5, in percentile
  File "/home/users/sbusi/apps/miniconda3/envs/hgtector/lib/python3.8/site-packages/numpy/lib/function_base.py", line 3705, in percentile
    return _quantile_unchecked(
  File "/home/users/sbusi/apps/miniconda3/envs/hgtector/lib/python3.8/site-packages/numpy/lib/function_base.py", line 3824, in _quantile_unchecked
    r, k = _ureduce(a, func=_quantile_ureduce_func, q=q, axis=axis, out=out,
  File "/home/users/sbusi/apps/miniconda3/envs/hgtector/lib/python3.8/site-packages/numpy/lib/function_base.py", line 3403, in _ureduce
    r = func(a, **kwargs)
  File "/home/users/sbusi/apps/miniconda3/envs/hgtector/lib/python3.8/site-packages/numpy/lib/function_base.py", line 3941, in _quantile_ureduce_func
    x1 = take(ap, indices_below, axis=axis) * weights_below
  File "<__array_function__ internals>", line 5, in take
  File "/home/users/sbusi/apps/miniconda3/envs/hgtector/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 194, in take
    return _wrapfunc(a, 'take', indices, axis=axis, out=out, mode=mode)
  File "/home/users/sbusi/apps/miniconda3/envs/hgtector/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 61, in _wrapfunc
    return bound(*args, **kwds)
IndexError: cannot do a non-empty take from an empty axes.
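This traceback shows np.percentile being called on an empty array, which suggests that at least one pair of groups had no identity values at all. A minimal sketch of a guard, assuming the identities for a group pair arrive as a plain sequence of numbers (the function name `identity_cutoff` and the `default` parameter are hypothetical, not MetaCHIP's own code). Behaviour for empty input may differ between NumPy versions, so the sketch checks the size explicitly rather than catching the exception:

```python
import numpy as np


def identity_cutoff(identities, percentile, default=None):
    """Return the requested percentile, or a default when there are no values."""
    values = np.asarray(identities, dtype=float)
    if values.size == 0:
        # No hits between this pair of groups: nothing to compute a cutoff from.
        return default
    return float(np.percentile(values, percentile))


print(identity_cutoff([80.0, 90.0, 100.0], 50))
print(identity_cutoff([], 90))
```

Whether an empty group pair should be skipped or given a fallback cutoff is a design decision for the tool; the sketch only shows where such a check could sit.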
Could you share your commands and some of your input files with me, so that I can try to reproduce the error? You can send a Google Drive or Dropbox link to songwz03@gmail.com. Thanks for reporting this error :)
Thank you. Here are my commands:
MetaCHIP PI -p tx-2 -r g -t 24 -i cont-2 -x fa -taxon tx-2/tx-2_gtdbtk.tsv
MetaCHIP BP -p tx-2 -r g -t 24 -force
I just sent you a link to the folder with the files. Let me know if you have trouble accessing them.
@songweizhi I did a clean re-install using the attached YAML file. I got past the previous error, but now there is a different one. See the attached log file.
@songweizhi UPDATE: I used a new YAML file but still get an error. You can find the files here: https://drive.google.com/drive/folders/1MqqorrBTjouxJ1YTWT-S-ROGcOfpOizw?usp=sharing
The error has to do with files that cannot be found. I checked, and they indeed don't exist.
[2020-06-05 12:48:28] PrepIn done!
[2020-06-05 12:48:29] Found grouping file tx-2_g26_grouping.txt, input genomes were clustered into 26 groups
[2020-06-05 12:48:29] Filtering blast matches with the following criteria: Query genome != Subject genome, Alignment length >= 200bp and coverage >= 75%
[2020-06-05 12:48:30] Combining filtered blastn results
[2020-06-05 12:48:30] Get group-to-group identities with 18 cores
[2020-06-05 12:48:30] Plotting identity distribution between each pair of groups
[2020-06-05 12:48:30] Analyzing Blast hits to get HGT candidates with 18 cores
[2020-06-05 12:48:31] Plotting flanking regions with 18 cores
Command line argument error: Argument "subject". File is not accessible: `tx-2_MetaCHIP_wd/tx-2_g26_HGTs_ip90_al200bp_c75_ei80_f10kbp/tx-2_g26_Flanking_region_plots/S6_MG_Paul_S5_G23_sub.contigs_00494___S6_MG_Paul_S5_maxbin_res.011.fasta_sub.contigs_00895/S6_MG_Paul_S5_maxbin_res.011.fasta_sub.contigs_00895_10000bp.fasta'
Command line argument error: Argument "subject". File is not accessible: `tx-2_MetaCHIP_wd/tx-2_g26_HGTs_ip90_al200bp_c75_ei80_f10kbp/tx-2_g26_Flanking_region_plots/S6_MG_Paul_S5_G23_sub.contigs_01650___S6_MG_Paul_S5_maxbin_res.011.fasta_sub.contigs_00895/S6_MG_Paul_S5_maxbin_res.011.fasta_sub.contigs_00895_10000bp.fasta'
Command line argument error: Argument "subject". File is not accessible: `tx-2_MetaCHIP_wd/tx-2_g26_HGTs_ip90_al200bp_c75_ei80_f10kbp/tx-2_g26_Flanking_region_plots/S6_MG_Paul_S5_G20.contigs_00776___S6_MG_Paul_S5_maxbin_res.008.fasta_sub.contigs_00148/S6_MG_Paul_S5_maxbin_res.008.fasta_sub.contigs_00148_10000bp.fasta'
Command line argument error: Argument "subject". File is not accessible: `tx-2_MetaCHIP_wd/tx-2_g26_HGTs_ip90_al200bp_c75_ei80_f10kbp/tx-2_g26_Flanking_region_plots/S6_MG_Paul_S5_G23_sub.contigs_01451___S6_MG_Paul_S5_maxbin_res.008.fasta_sub.contigs_01906/S6_MG_Paul_S5_maxbin_res.008.fasta_sub.contigs_01906_10000bp.fasta'
Command line argument error: Argument "subject". File is not accessible: `tx-2_MetaCHIP_wd/tx-2_g26_HGTs_ip90_al200bp_c75_ei80_f10kbp/tx-2_g26_Flanking_region_plots/S6_MG_Paul_S5_G4.1.contigs_00956___S6_MG_Paul_S5_maxbin_res.008.fasta_sub.contigs_02021/S6_MG_Paul_S5_maxbin_res.008.fasta_sub.contigs_02021_10000bp.fasta'
Command line argument error: Argument "subject". File is not accessible: `tx-2_MetaCHIP_wd/tx-2_g26_HGTs_ip90_al200bp_c75_ei80_f10kbp/tx-2_g26_Flanking_region_plots/S6_MG_Paul_S5_G23_sub.contigs_00494___S6_MG_Paul_S5_maxbin_res.011.fasta_sub.contigs_00895/S6_MG_Paul_S5_maxbin_res.011.fasta_sub.contigs_00895.fasta'
Command line argument error: Argument "subject". File is not accessible: `tx-2_MetaCHIP_wd/tx-2_g26_HGTs_ip90_al200bp_c75_ei80_f10kbp/tx-2_g26_Flanking_region_plots/S6_MG_Paul_S5_G23_sub.contigs_00236___S6_MG_Paul_S5_maxbin_res.008.fasta_sub.contigs_00331/S6_MG_Paul_S5_maxbin_res.008.fasta_sub.contigs_00331_10000bp.fasta'
Command line argument error: Argument "subject". File is not accessible: `tx-2_MetaCHIP_wd/tx-2_g26_HGTs_ip90_al200bp_c75_ei80_f10kbp/tx-2_g26_Flanking_region_plots/S6_MG_Paul_S5_G23_sub.contigs_01650___S6_MG_Paul_S5_maxbin_res.011.fasta_sub.contigs_00895/S6_MG_Paul_S5_maxbin_res.011.fasta_sub.contigs_00895.fasta'
@songweizhi
It WORKS (for the most part). The issue was the file names. I replaced all the file names with very short ones, i.e. something like test_1.fa. It's not just the contig_names but the actual filenames that were causing the issue.
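The renaming workaround described here can be sketched as a small helper that assigns short names and keeps a mapping back to the originals. This is only an illustration: the function name, the `test` prefix, and the two example file names are assumptions, and the thread does not establish exactly which property of the original names caused the failure.

```python
def build_short_names(filenames, prefix='test', extension='fa'):
    """Map each original file name to a short one such as test_1.fa."""
    mapping = {}
    for index, original in enumerate(sorted(filenames), start=1):
        mapping[original] = '%s_%d.%s' % (prefix, index, extension)
    return mapping


# Illustrative file names modelled on the genome names in the log above.
originals = [
    'S6_MG_Paul_S5_maxbin_res.011.fasta_sub.fa',
    'S6_MG_Paul_S5_G23_sub.fa',
]

for original, short in build_short_names(originals).items():
    # Print a tab-separated table so results can be traced back later.
    print('%s\t%s' % (short, original))
```

Saving the printed table alongside the renamed files makes it possible to translate predicted HGTs back to the original bin names.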
It ran past the previous error, but now there is a RANGER-DTL2 error. See below (and attached file):
[2020-06-05 18:26:11] Plotting flanking regions with 18 cores
[2020-06-05 18:29:54] Extracting nc sequences for BM predicted HGTs
[2020-06-05 18:29:55] Deleting temporary files
[2020-06-05 18:29:55] Done for Best-match approach!
[2020-06-05 18:29:55] Found grouping file tx-2_g26_grouping.txt, input genomes were clustered into 26 groups
[2020-06-05 18:29:55] Get gene/genome member in gene/species tree for each BM predicted HGT
[2020-06-05 18:29:55] Prepare subset of tx-2_all_combined_faa.fasta for building gene tree
[2020-06-05 18:29:56] Get species/gene tree for 571 BM approach identified HGTs with 18 cores
[2020-06-05 18:30:31] Running Ranger-DTL2 with dated mode
ERROR: missing ')' in input tree expression line 2 column 35
ERROR: missing ')' in input tree expression line 2 column 15
ERROR: missing ')' in input tree expression line 2 column 35
ERROR: missing ')' in input tree expression line 2 column 15
ERROR: missing ')' in input tree expression line 2 column 14
ERROR: missing ')' in input tree expression line 2 column 14
ERROR: missing ')' in input tree expression line 2 column 14
ERROR: missing ')' in input tree expression line 2 column 15
ERROR: missing ')' in input tree expression line 2 column 34
ERROR: missing ')' in input tree expression line 2 column 14
ERROR: missing ')' in input tree expression line 2 column 88
ERROR: missing ')' in input tree expression line 2 column 14
ERROR: missing ')' in input tree expression line 2 column 14
ERROR: missing ')' in input tree expression line 2 column 59
ERROR: missing ')' in input tree expression line 2 column 15
ERROR: missing ')' in input tree expression line 2 column 56
ERROR: missing ')' in input tree expression line 2 column 15
ERROR: missing ')' in input tree expression line 2 column 34
ERROR: missing ')' in input tree expression line 2 column 34
ERROR: missing ')' in input tree expression line 2 column 15
ERROR: missing ')' in input tree expression line 2 column 57
ERROR: missing ')' in input tree expression line 2 column 57
ERROR: missing ')' in input tree expression line 2 column 59
ERROR: missing ')' in input tree expression line 2 column 14
ERROR: missing ')' in input tree expression line 2 column 67
ERROR: missing ')' in input tree expression line 2 column 67
ERROR: missing ')' in input tree expression line 2 column 15
ERROR: missing ')' in input tree expression line 2 column 34
ERROR: missing ')' in input tree expression line 2 column 15
[2020-06-05 18:30:33] Parsing Ranger prediction results
[2020-06-05 18:30:33] Add Ranger-DTL predicted direction to HGT_candidates.txt
[2020-06-05 18:30:33] Deleting temporary files
[2020-06-05 18:30:38] Done for Phylogenetic approach!
== Ending run at Fri Jun 5 18:30:44 CEST 2020
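The "missing ')'" messages indicate that Ranger-DTL could not parse some of the input trees. The thread does not say what caused this or what changed in the later release, so the following is only a diagnostic sketch. It relies on the Newick convention that unquoted labels must not contain parentheses, brackets, commas, colons, semicolons, single quotes, or whitespace; the function names and example labels are hypothetical.

```python
# Characters that are not allowed in unquoted Newick labels.
NEWICK_RESERVED = set("()[],:;'") | set(' \t\n')


def parentheses_balanced(newick):
    """Check that every ')' has a matching '(' (quoted labels are not handled)."""
    depth = 0
    for char in newick:
        if char == '(':
            depth += 1
        elif char == ')':
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def risky_labels(labels):
    """Return the labels that contain characters reserved by the Newick format."""
    return [label for label in labels if NEWICK_RESERVED & set(label)]


print(parentheses_balanced('((A,B),C);'))
print(parentheses_balanced('((A,B),C;'))
print(risky_labels(['bin_1', 'bin(2)', 'bin 3']))
```

Running checks like these on the intermediate tree files and on the genome or gene names could help narrow down whether naming is involved.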
@songweizhi With the newly updated MetaCHIP v1.9.1, there are no more errors. Here's the output for the BP module alone.
(METACHIP) [sbusi@iris-001 metachip]$ MetaCHIP BP -p tx-3 -r g -t 24 -force -tmp
[2020-06-07 17:12:30] Found grouping file tx-3_g27_grouping.txt, input genomes were clustered into 27 groups
[2020-06-07 17:12:33] Filtered blastn results at specified taxonomic rank detected from folder tx-3_g27_blastn_results_filtered. HGT analysis will be performed based on these files.
[2020-06-07 17:12:33] Combining filtered blastn results
[2020-06-07 17:12:35] Get group-to-group identities with 24 cores
[2020-06-07 17:12:36] Plotting identity distribution between each pair of groups
[2020-06-07 17:12:37] Analyzing Blast hits to get HGT candidates with 24 cores
[2020-06-07 17:12:39] Plotting flanking regions with 24 cores
[2020-06-07 17:20:44] Extracting nc sequences for BM predicted HGTs
[2020-06-07 17:20:46] Done for Best-match approach!
[2020-06-07 17:20:46] Found grouping file tx-3_g27_grouping.txt, input genomes were clustered into 27 groups
[2020-06-07 17:20:46] Get gene/genome member in gene/species tree for each BM predicted HGT
[2020-06-07 17:20:46] Prepare subset of tx-3_all_combined_faa.fasta for building gene tree
[2020-06-07 17:20:48] Get species/gene tree for 781 BM approach identified HGTs with 24 cores
[2020-06-07 17:24:47] Running Ranger-DTL2 with dated mode
[2020-06-07 17:24:55] Parsing Ranger prediction results
[2020-06-07 17:24:55] Add Ranger-DTL predicted direction to HGT_candidates.txt
[2020-06-07 17:24:55] Deleting temporary files
[2020-06-07 17:24:55] Done for Phylogenetic approach!
(METACHIP) [sbusi@iris-001 metachip]$
Thanks for all the help!
@songweizhi
I ran MetaCHIP successfully in the past, but when I tried it again today, I ran into the following issues:
Is the BLAST index related to the plotting error? Is there a workaround?
Thank you!