phac-nml / refseq_masher

Mash MinHash search your nucleotide sequences against an NCBI RefSeq genomes database
Apache License 2.0

contains command is broken via conda install #2

Open ekg opened 5 years ago

ekg commented 5 years ago

I installed via:

conda install -c bioconda refseq_masher

But I can't seem to get contains to work:

-> % time refseq_masher contains --top-n-results 50 -p 12 gut_bridge_sequences.fasta
Traceback (most recent call last):
  File "/home/erik/anaconda3/bin/refseq_masher", line 6, in <module>
    sys.exit(refseq_masher.cli.cli())
  File "/home/erik/.local/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/erik/.local/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/erik/.local/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/erik/.local/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/erik/.local/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/home/erik/anaconda3/lib/python3.6/site-packages/refseq_masher/cli.py", line 130, in contains
    parallelism=parallelism)
  File "/home/erik/anaconda3/lib/python3.6/site-packages/refseq_masher/mash/screen.py", line 46, in vs_refseq
    df = mash_screen_output_to_dataframe(stdout)
  File "/home/erik/anaconda3/lib/python3.6/site-packages/refseq_masher/mash/parser.py", line 120, in mash_screen_output_to_dataframe
    df.sort_values(by=['identity', 'median_multiplicity'], ascending=[False, False], inplace=True)
  File "/home/erik/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 4408, in sort_values
    stacklevel=stacklevel)
  File "/home/erik/anaconda3/lib/python3.6/site-packages/pandas/core/generic.py", line 1379, in _get_label_or_level_values
    raise KeyError(key)
KeyError: 'median_multiplicity'
refseq_masher contains --top-n-results 50 -p 12 gut_bridge_sequences.fasta  0.73s user 0.31s system 135% cpu 0.761 total

Any ideas?
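
One thing visible in the traceback above: `click` is imported from `/home/erik/.local/lib/python3.6/site-packages`, while `refseq_masher` and `pandas` are imported from `/home/erik/anaconda3/lib/python3.6/site-packages`. Whether that mix is related to the `KeyError` is not established in this thread. As a diagnostic aid only, here is a minimal sketch that lists the distinct `site-packages` directories named in a traceback. The helper name `site_packages_roots` is hypothetical and not part of refseq_masher.

```python
import re


def site_packages_roots(traceback_text):
    """Return the distinct site-packages directories named in a traceback,
    in order of first appearance."""
    marker = "/site-packages"
    roots = []
    # Traceback frames look like:  File "/path/to/module.py", line N, in func
    for path in re.findall(r'File "([^"]+)"', traceback_text):
        idx = path.find(marker)
        if idx == -1:
            continue  # e.g. the entry-point script under bin/
        root = path[: idx + len(marker)]
        if root not in roots:
            roots.append(root)
    return roots


# Two frames copied from the traceback in this report.
sample = '''
  File "/home/erik/.local/lib/python3.6/site-packages/click/core.py", line 722, in __call__
  File "/home/erik/anaconda3/lib/python3.6/site-packages/refseq_masher/cli.py", line 130, in contains
'''

roots = site_packages_roots(sample)
for root in roots:
    print(root)
if len(roots) > 1:
    print("More than one site-packages directory is in use.")
```

If more than one directory is printed, packages are being picked up from outside the conda environment, which may be worth ruling out before digging into the parser itself.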

apetkau commented 5 years ago

Is this still an issue? I tried installing via conda and contains worked for me.

Sorry for the late response.

leomrtns commented 4 years ago

Just to add that I have a similar problem with conda + Python 3.6. With conda + Python 3.7 it works fine.

$ refseq_masher contains out/contigs.fa
Loading /usr/users/QIB_fr005/deolivl/local/miniconda3/envs/py36/lib/python3.6/site-packages/refseq_masher/data/RefSeqSketches.msh...
   4669418 distinct hashes.
Streaming from /usr/users/QIB_fr005/deolivl/Downloads/Charlys/Leo_01/out/contigs.fa...
   Estimated distinct k-mers in mixture: 698953
Summing shared...
Computing coverage medians...
Writing output...
Traceback (most recent call last):
  File "/usr/users/QIB_fr005/deolivl/local/miniconda3/envs/py36/bin/refseq_masher", line 10, in <module>
    sys.exit(cli())
  File "/usr/users/QIB_fr005/deolivl/local/miniconda3/envs/py36/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/usr/users/QIB_fr005/deolivl/local/miniconda3/envs/py36/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/usr/users/QIB_fr005/deolivl/local/miniconda3/envs/py36/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/users/QIB_fr005/deolivl/local/miniconda3/envs/py36/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/users/QIB_fr005/deolivl/local/miniconda3/envs/py36/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/usr/users/QIB_fr005/deolivl/local/miniconda3/envs/py36/lib/python3.6/site-packages/refseq_masher/cli.py", line 136, in contains
    parallelism=parallelism)
  File "/usr/users/QIB_fr005/deolivl/local/miniconda3/envs/py36/lib/python3.6/site-packages/refseq_masher/mash/screen.py", line 46, in vs_refseq
    df = mash_screen_output_to_dataframe(stdout)
  File "/usr/users/QIB_fr005/deolivl/local/miniconda3/envs/py36/lib/python3.6/site-packages/refseq_masher/mash/parser.py", line 124, in mash_screen_output_to_dataframe
    dfmerge = pd.merge (dfmatch, df, on='match_id')
  File "/usr/users/QIB_fr005/deolivl/local/miniconda3/envs/py36/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 86, in merge
    validate=validate,
  File "/usr/users/QIB_fr005/deolivl/local/miniconda3/envs/py36/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 627, in __init__
    ) = self._get_merge_keys()
  File "/usr/users/QIB_fr005/deolivl/local/miniconda3/envs/py36/lib/python3.6/site-packages/pandas/core/reshape/merge.py", line 996, in _get_merge_keys
    left_keys.append(left._get_label_or_level_values(lk))
  File "/usr/users/QIB_fr005/deolivl/local/miniconda3/envs/py36/lib/python3.6/site-packages/pandas/core/generic.py", line 1692, in _get_label_or_level_values
    raise KeyError(key)
KeyError: 'match_id'

cheers