songweizhi / MetaCHIP

Horizontal gene transfer (HGT) identification pipeline
GNU Affero General Public License v3.0

Find HGT between genes of different genomes #17

Open ga23981 opened 3 years ago

ga23981 commented 3 years ago

I made multi-FASTA files containing the genes of interest from several genomes, placed them in a directory called genomes, and used that directory as input as follows:

singularity exec /apps/singularity-images/metachip_1.10.2.sif MetaCHIP PI -p pantoea -g customised_grouping.txt -t 30 -i genomes -x fasta
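To show what each input file actually contains, the files can be summarized before running PI. The following is a minimal Python sketch and is not part of MetaCHIP; the function names summarize_fasta and summarize_folder are hypothetical, and it assumes plain-text FASTA files with the fasta extension passed to -x above.

```python
from pathlib import Path


def summarize_fasta(path):
    """Return (number of records, total residues) for one FASTA file."""
    n_records = 0
    n_residues = 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                n_records += 1  # header line starts a new record
            else:
                n_residues += len(line)  # sequence line
    return n_records, n_residues


def summarize_folder(folder, extension="fasta"):
    """Map each file stem in `folder` to its (records, residues) summary."""
    return {
        p.stem: summarize_fasta(p)
        for p in sorted(Path(folder).glob(f"*.{extension}"))
    }
```

Calling summarize_folder("genomes") returns one entry per input file, which makes it easy to see at a glance how many sequences and how many bases each file holds.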

The customised_grouping.txt file looks as follows:

A,PNA_99_2
A,PNA_99_3
A,PNA_99_6
A,PNA_99_7
A,PNA_99_8
A,PNA_99_9
B,PANS_2_1
B,PANS_4_2
B,PANS_99_32
B,PNA_07_13
B,PNA_07_14
B,PNA_98_11
B,PNA_98_3
B,PNA_98_7
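One thing that can be ruled out quickly is a mismatch between the grouping file and the input folder. The following is a minimal Python sketch, not part of MetaCHIP; the helper names read_grouping and compare_grouping_to_folder are hypothetical, and it assumes each line has the form group,genome_id and that each genome ID equals an input file name without its extension.

```python
from pathlib import Path


def read_grouping(grouping_file):
    """Parse 'group,genome_id' lines into a dict of genome_id -> group."""
    genome_to_group = {}
    with open(grouping_file) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            group, genome_id = line.split(",", 1)
            genome_to_group[genome_id.strip()] = group.strip()
    return genome_to_group


def compare_grouping_to_folder(grouping_file, genome_folder, extension):
    """Return (IDs with no matching file, files with no grouping entry)."""
    grouped = set(read_grouping(grouping_file))
    on_disk = {p.stem for p in Path(genome_folder).glob(f"*.{extension}")}
    return sorted(grouped - on_disk), sorted(on_disk - grouped)
```

For the run above, the call would be compare_grouping_to_folder("customised_grouping.txt", "genomes", "fasta"); two empty lists mean every grouped ID has a file and every file has a group.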

Getting an error:

WARNING: Skipping mount /var/singularity/mnt/session/etc/resolv.conf [files]: /etc/resolv.conf doesn't exist in container

Error: File existence/permissions problem in trying to open HMM file pantoea_MetaCHIP_wd/pantoea_x_get_SCG_tree_wd/pantoea_x_hmm_profile_fetched/.hmm.
HMM file pantoea_MetaCHIP_wd/pantoea_x_get_SCG_tree_wd/pantoea_x_hmm_profile_fetched/.hmm not found (nor an .h3m binary of it)
Traceback (most recent call last):
  File "/usr/local/bin/MetaCHIP", line 165, in <module>
    PI(args, MetaCHIP_config.config_dict)
  File "/usr/local/lib/python3.8/site-packages/MetaCHIP/PI.py", line 974, in PI
    remove_low_cov_and_consensus_columns(pwd_combined_alignment_file_tmp, minimal_cov_in_msa, min_consensus_in_msa, pwd_combined_alignment_file)
  File "/usr/local/lib/python3.8/site-packages/MetaCHIP/PI.py", line 537, in remove_low_cov_and_consensus_columns
    alignment_cov = remove_low_cov_columns(alignment, minimal_cov)
  File "/usr/local/lib/python3.8/site-packages/MetaCHIP/PI.py", line 496, in remove_low_cov_columns
    alignment_new = remove_columns_from_msa(alignment_in, low_cov_columns)
  File "/usr/local/lib/python3.8/site-packages/MetaCHIP/PI.py", line 464, in remove_columns_from_msa
    segment_value = alignment_in[:, segment[0]]
  File "/usr/local/lib/python3.8/site-packages/Bio/Align/__init__.py", line 848, in __getitem__
    new = MultipleSeqAlignment(
  File "/usr/local/lib/python3.8/site-packages/Bio/Align/__init__.py", line 170, in __init__
    self.extend(records)
  File "/usr/local/lib/python3.8/site-packages/Bio/Align/__init__.py", line 533, in extend
    rec = next(records)
  File "/usr/local/lib/python3.8/site-packages/Bio/Align/__init__.py", line 849, in <genexpr>
    (rec[col_index] for rec in self._records[row_index]), self._alphabet
  File "/usr/local/lib/python3.8/site-packages/Bio/SeqRecord.py", line 524, in __getitem__
    raise ValueError("Invalid index")
ValueError: Invalid index

The program did, however, generate an incomplete output directory, pantoea_MetaCHIP_wd, containing these three subdirectories: pantoea_x_get_SCG_tree_wd, pantoea_x_log_files and pantoea_x_prodigal_output.

As a result, the BP command does not generate any HGT output files, because the PI command produced no blastall files.

I can also share my input genome files if needed.

songweizhi commented 3 years ago

Can you please update MetaCHIP to version 1.10.3 and try again?

ga23981 commented 3 years ago

I tried the new version. The PI command generated empty .faa, .ffn, .gbk and .sco files. What could be the possible reason for this? If needed I can share one or all of my input fasta files. Just to let you know all the fasta files contain only 12-15 genes that I want to analyze for HGT.
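To list exactly which output files came out empty, something like the following could be run on the working directory. It is a minimal Python sketch, not part of MetaCHIP; the function name find_empty_files is hypothetical, and the default extensions are simply the four mentioned above.

```python
from pathlib import Path


def find_empty_files(folder, extensions=("faa", "ffn", "gbk", "sco")):
    """Return sorted paths under `folder` that have one of the given
    extensions and a size of zero bytes."""
    wanted = {"." + ext for ext in extensions}
    return sorted(
        str(p)
        for p in Path(folder).rglob("*")  # search subdirectories too
        if p.is_file() and p.suffix in wanted and p.stat().st_size == 0
    )
```

For this run, find_empty_files("pantoea_MetaCHIP_wd") would return the empty files in every subdirectory, including pantoea_x_prodigal_output.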

songweizhi commented 3 years ago

Please share some of your input files with me so I can look into it. Thanks, Weizhi

ga23981 commented 3 years ago

Hello Weizhi, Please see attached some of the fasta files.

Let me know if you need any more details.

Thank you, Gaurav


ga23981 commented 3 years ago

After updating Biopython, I get the following error with the files I shared with you. If needed, I can share all of the files.

Error: File existence/permissions problem in trying to open HMM file pantoea_MetaCHIP_wd/pantoea_x_get_SCG_tree_wd/pantoea_x_hmm_profile_fetched/.hmm.
HMM file pantoea_MetaCHIP_wd/pantoea_x_get_SCG_tree_wd/pantoea_x_hmm_profile_fetched/.hmm not found (nor an .h3m binary of it)
Traceback (most recent call last):
  File "/apps/eb/MetaCHIP/1.10.3-foss-2019b-Python-3.8.2/bin/MetaCHIP", line 165, in <module>
    PI(args, MetaCHIP_config.config_dict)
  File "/apps/eb/MetaCHIP/1.10.3-foss-2019b-Python-3.8.2/lib/python3.8/site-packages/MetaCHIP/PI.py", line 958, in PI
    remove_low_cov_and_consensus_columns(pwd_combined_alignment_file_tmp, minimal_cov_in_msa, min_consensus_in_msa, pwd_combined_alignment_file)
  File "/apps/eb/MetaCHIP/1.10.3-foss-2019b-Python-3.8.2/lib/python3.8/site-packages/MetaCHIP/PI.py", line 521, in remove_low_cov_and_consensus_columns
    alignment_cov = remove_low_cov_columns(alignment, minimal_cov)
  File "/apps/eb/MetaCHIP/1.10.3-foss-2019b-Python-3.8.2/lib/python3.8/site-packages/MetaCHIP/PI.py", line 480, in remove_low_cov_columns
    alignment_new = remove_columns_from_msa(alignment_in, low_cov_columns)
  File "/apps/eb/MetaCHIP/1.10.3-foss-2019b-Python-3.8.2/lib/python3.8/site-packages/MetaCHIP/PI.py", line 448, in remove_columns_from_msa
    segment_value = alignment_in[:, segment[0]]
  File "/apps/eb/Biopython/1.78-foss-2019b-Python-3.8.2/lib/python3.8/site-packages/Bio/Align/__init__.py", line 823, in __getitem__
    new = MultipleSeqAlignment(
  File "/apps/eb/Biopython/1.78-foss-2019b-Python-3.8.2/lib/python3.8/site-packages/Bio/Align/__init__.py", line 162, in __init__
    self.extend(records)
  File "/apps/eb/Biopython/1.78-foss-2019b-Python-3.8.2/lib/python3.8/site-packages/Bio/Align/__init__.py", line 514, in extend
    rec = next(records)
  File "/apps/eb/Biopython/1.78-foss-2019b-Python-3.8.2/lib/python3.8/site-packages/Bio/Align/__init__.py", line 824, in <genexpr>
    (rec[col_index] for rec in self._records[row_index])
  File "/apps/eb/Biopython/1.78-foss-2019b-Python-3.8.2/lib/python3.8/site-packages/Bio/SeqRecord.py", line 520, in __getitem__
    raise ValueError("Invalid index")
ValueError: Invalid index

Thank you, Gaurav


bahiyahazli commented 1 year ago

Hello, I am getting the same error with my data. Did you manage to resolve this issue?

Thank you