AstrobioMike / GToTree

A user-friendly workflow for phylogenomics
GNU General Public License v3.0

Error with pfam_counting.py #85

Closed SandersonHaley closed 4 months ago

SandersonHaley commented 4 months ago

python pfam_counting.py -p coverage_filtered_pfam_ids -g all_genome_accs -H All_bacterial_pfam_hmm_results.tab -o All_bacterial_pfam_hmm_counts.tsv

I tried with both the bit-toolkit and gtotree environments and both gave the same error.

Traceback (most recent call last):
  File "/gpfs/fs7/grdi/genarcc/wp3/common/conda/envs/gtotree/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3802, in get_loc
    return self._engine.get_loc(casted_key)
  File "index.pyx", line 153, in pandas._libs.index.IndexEngine.get_loc
  File "index.pyx", line 182, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 7081, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 7089, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'sequence'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/gpfs/fs7/grdi/genarcc/wp3/common/conda/envs/gtotree/share/gtotree/hmm_sets/pfam_counting.py", line 60, in <module>
    df.loc[curr_acc,curr_pfam] = df.loc[curr_acc,curr_pfam] + 1
  File "/gpfs/fs7/grdi/genarcc/wp3/common/conda/envs/gtotree/lib/python3.9/site-packages/pandas/core/indexing.py", line 1184, in __getitem__
    return self.obj._get_value(*key, takeable=self._takeable)
  File "/gpfs/fs7/grdi/genarcc/wp3/common/conda/envs/gtotree/lib/python3.9/site-packages/pandas/core/frame.py", line 4202, in _get_value
    series = self._get_item_cache(col)
  File "/gpfs/fs7/grdi/genarcc/wp3/common/conda/envs/gtotree/lib/python3.9/site-packages/pandas/core/frame.py", line 4626, in _get_item_cache
    loc = self.columns.get_loc(item)
  File "/gpfs/fs7/grdi/genarcc/wp3/common/conda/envs/gtotree/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3809, in get_loc
    raise KeyError(key) from err
KeyError: 'sequence'
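
For anyone reading the traceback above: the final frames show pandas looking up the label 'sequence' among the table's columns and not finding it. The sketch below reproduces that kind of failure on a toy table and shows one possible guard. It is not the code in pfam_counting.py. The function name count_hits, the genome labels, and the Pfam IDs are made-up placeholders, and why 'sequence' turned up as a column label in the original run is not established in this thread.

```python
import pandas as pd

def count_hits(genomes, pfams, hits):
    """Tally (genome, pfam) pairs into a count table.

    Hypothetical helper for illustration only; pairs whose labels are
    not in the table are collected instead of raising KeyError.
    """
    df = pd.DataFrame(0, index=genomes, columns=pfams)
    skipped = []
    for acc, pfam in hits:
        if acc not in df.index or pfam not in df.columns:
            skipped.append((acc, pfam))
            continue
        df.loc[acc, pfam] += 1
    return df, skipped

# Placeholder labels, not taken from the issue.
genomes = ["genome_A", "genome_B"]
pfams = ["PF00001", "PF00002"]

# Reading a cell whose column label is absent raises KeyError.
table = pd.DataFrame(0, index=genomes, columns=pfams)
try:
    table.loc["genome_A", "sequence"]
    missing_label = None
except KeyError as err:
    missing_label = err.args[0]

# The guarded version counts valid pairs and reports the rest.
hits = [
    ("genome_A", "PF00001"),
    ("genome_A", "PF00001"),
    ("genome_B", "sequence"),  # label that is not a column
]
counts, skipped = count_hits(genomes, pfams, hits)
```

Checking the skipped list after a run like this shows which labels from the hits input never matched the list of IDs, which may help narrow down a formatting mismatch between the input files.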

SandersonHaley commented 4 months ago

Alternatively, would you be able to share a pre-made bacteria.hmm? The conda install for gtotree did not include the default hmm files.

SandersonHaley commented 4 months ago

I found the links to the pre-made sets, so I no longer need to produce my own. I've closed this issue.

AstrobioMike commented 4 months ago

Sorry for the confusion. The first time you run the program with one of the pre-made HMM sets specified, GToTree downloads that set automatically and stores it in the location reported when you run gtt-hmms (this saves on install time and storage space). You don't need to download them yourself 👍
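
The download-on-first-use behaviour described above can be pictured roughly as follows. This is a generic sketch of the pattern, not GToTree's actual implementation. The function get_hmm_set, the fetch callable, and the file naming are all hypothetical, and the fetch step is faked so the example runs without network access.

```python
from pathlib import Path
import tempfile

def get_hmm_set(name, cache_dir, fetch):
    """Return the path to a cached HMM file, fetching it only if missing.

    Hypothetical illustration of lazy caching; `fetch` is any callable
    that takes a set name and returns the file contents as bytes.
    """
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / f"{name}.hmm"
    if not target.exists():
        # First request: retrieve the file and store it for later runs.
        target.write_bytes(fetch(name))
    return target

calls = []

def fake_fetch(name):
    """Stand-in for a real download; records each call."""
    calls.append(name)
    return b"placeholder HMM content\n"

with tempfile.TemporaryDirectory() as tmp:
    first = get_hmm_set("bacteria", tmp, fake_fetch)
    second = get_hmm_set("bacteria", tmp, fake_fetch)
    same_path = first == second
    cached_on_disk = first.exists()
```

The second call finds the file already in place and does not fetch again, which is why nothing needs to ship with the install itself.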