nanoporetech / pore-c

Pore-C support
Mozilla Public License 2.0

Bio.Alphabet removed #40

Closed: natforsdick closed this issue 3 years ago

natforsdick commented 3 years ago

I'm trying to use the Pore-C-Snakemake wrapper for Pore-C tools, but I've run into an issue that I'm not sure how to solve, and I was hoping you might be able to help.

Pore-C tools and the Snakemake wrapper are installed and running under Miniconda v4.8.3. During the virtual_digest steps, the following error is printed to the virtual_digest-specific log:

"Bio.Alphabet has been removed from Biopython. In many cases, the alphabet can simply be ignored and removed from scripts. In a few cases, you may need to specify the molecule_type as an annotation on a SeqRecord for your script to work correctly. Please see https://biopython.org/wiki/Alphabet for more information."

I found a single place in the program files where Pore-C tools imports Bio.Alphabet, in the pore-c reference.py file: from Bio.Alphabet.IUPAC import IUPACAmbiguousDNA

My understanding is that this needs to be replaced with the equivalent Bio.Seq, but I'm not sure what that should be. Any help would be much appreciated.
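The Biopython notice quoted above says the alphabet can often simply be dropped. The following is a minimal sketch of what that could look like, assuming the import in reference.py is only used to build a Seq that is then searched for restriction sites. The function name and signature are illustrative and are not taken from pore-c.

```python
def find_site_positions(enzyme_name, sequence):
    """Return the cut positions of a restriction enzyme in a DNA string.

    Assumption: this mirrors what the flagged import was used for; the
    name and signature are hypothetical, not pore-c's own code.
    """
    # Imported lazily so this module still loads where Biopython is absent.
    from Bio import Restriction
    from Bio.Seq import Seq

    # Biopython exposes each enzyme as an attribute of Bio.Restriction.
    enzyme = getattr(Restriction, enzyme_name, None)
    if enzyme is None:
        raise ValueError(f"Enzyme not found: {enzyme_name}")

    # No alphabet argument: on Biopython 1.78 and later a Seq is built
    # from the string alone.
    return enzyme.search(Seq(sequence))
```

A call such as find_site_positions("NlaIII", "AAACATGAAA") would then return the list of positions that Biopython reports for that enzyme.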

eharr commented 3 years ago

Hi @natforsdick,

This will be fixed in a future version of the tools, but for now you can work around it manually by running conda install biopython==1.77 in your pore_c environment.

All the best, Eoghan

eharr commented 3 years ago

@natforsdick In the latest version of pore-c tools, the version of Biopython has been pinned, so you shouldn't get this error.

iagooteroc commented 3 years ago

Hi @eharr, I'm having the same issue even with the latest version of pore-c. This is the log from my virtual digest:

Traceback (most recent call last):
  File "/mnt/netapp1/genomes_tmp/pore-c-snakemake-test/H2087/.snakemake/conda/502228d8/bin/pore_c", line 10, in <module>
    sys.exit(cli())
  File "/mnt/netapp1/genomes_tmp/pore-c-snakemake-test/H2087/.snakemake/conda/502228d8/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/mnt/netapp1/genomes_tmp/pore-c-snakemake-test/H2087/.snakemake/conda/502228d8/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/mnt/netapp1/genomes_tmp/pore-c-snakemake-test/H2087/.snakemake/conda/502228d8/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/mnt/netapp1/genomes_tmp/pore-c-snakemake-test/H2087/.snakemake/conda/502228d8/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/mnt/netapp1/genomes_tmp/pore-c-snakemake-test/H2087/.snakemake/conda/502228d8/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/mnt/netapp1/genomes_tmp/pore-c-snakemake-test/H2087/.snakemake/conda/502228d8/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/mnt/netapp1/genomes_tmp/pore-c-snakemake-test/H2087/.snakemake/conda/502228d8/lib/python3.8/site-packages/click/decorators.py", line 21, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/mnt/netapp1/genomes_tmp/pore-c-snakemake-test/H2087/.snakemake/conda/502228d8/lib/python3.8/site-packages/pore_c/cli.py", line 198, in virtual_digest
    frag_df, summary_stats = create_virtual_digest(fasta, digest_type, cut_on, **path_kwds)
  File "/mnt/netapp1/genomes_tmp/pore-c-snakemake-test/H2087/.snakemake/conda/502228d8/lib/python3.8/site-packages/pore_c/analyses/reference.py", line 51, in create_virtual_digest
    seq_bag.map(lambda x: (x["seqid"], x["seq"], digest_type, digest_param))
  File "/mnt/netapp1/genomes_tmp/pore-c-snakemake-test/H2087/.snakemake/conda/502228d8/lib/python3.8/site-packages/dask/base.py", line 167, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/mnt/netapp1/genomes_tmp/pore-c-snakemake-test/H2087/.snakemake/conda/502228d8/lib/python3.8/site-packages/dask/base.py", line 452, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/mnt/netapp1/genomes_tmp/pore-c-snakemake-test/H2087/.snakemake/conda/502228d8/lib/python3.8/site-packages/dask/multiprocessing.py", line 218, in get
    result = get_async(
  File "/mnt/netapp1/genomes_tmp/pore-c-snakemake-test/H2087/.snakemake/conda/502228d8/lib/python3.8/site-packages/dask/local.py", line 486, in get_async
    raise_exception(exc, tb)
  File "/mnt/netapp1/genomes_tmp/pore-c-snakemake-test/H2087/.snakemake/conda/502228d8/lib/python3.8/site-packages/dask/local.py", line 316, in reraise
    raise exc
  File "/mnt/netapp1/genomes_tmp/pore-c-snakemake-test/H2087/.snakemake/conda/502228d8/lib/python3.8/site-packages/dask/local.py", line 222, in execute_task
    result = _execute_task(task, data)
  File "/mnt/netapp1/genomes_tmp/pore-c-snakemake-test/H2087/.snakemake/conda/502228d8/lib/python3.8/site-packages/dask/core.py", line 121, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/mnt/netapp1/genomes_tmp/pore-c-snakemake-test/H2087/.snakemake/conda/502228d8/lib/python3.8/site-packages/dask/bag/core.py", line 1777, in reify
    seq = list(seq)
  File "/mnt/netapp1/genomes_tmp/pore-c-snakemake-test/H2087/.snakemake/conda/502228d8/lib/python3.8/site-packages/pore_c/analyses/reference.py", line 169, in create_fragment_dataframe
    DataFrame(find_fragment_intervals(digest_type, digest_param, seq))
  File "/mnt/netapp1/genomes_tmp/pore-c-snakemake-test/H2087/.snakemake/conda/502228d8/lib/python3.8/site-packages/pore_c/analyses/reference.py", line 123, in find_fragment_intervals
    positions = find_site_positions_biopython(digest_param, seq)
  File "/mnt/netapp1/genomes_tmp/pore-c-snakemake-test/H2087/.snakemake/conda/502228d8/lib/python3.8/site-packages/pore_c/analyses/reference.py", line 156, in find_site_positions_biopython
    from Bio.Alphabet.IUPAC import IUPACAmbiguousDNA
  File "/home/usc/mg/ioc/.local/lib/python3.8/site-packages/Bio/Alphabet/__init__.py", line 20, in <module>
    raise ImportError(
ImportError: Bio.Alphabet has been removed from Biopython. In many cases, the alphabet can simply be ignored and removed from scripts. In a few cases, you may need to specify the ``molecule_type`` as an annotation on a SeqRecord for your script to work correctly. Please see https://biopython.org/wiki/Alphabet for more information.

Notice how it ends up using the Biopython package in my .local directory instead of the one installed at .snakemake/conda/502228d8. Could it be fixed by specifying the Biopython version in pore-c/environment.yml?

Thank you, Iago Otero.
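One way to confirm the shadowing seen in the traceback is to ask Python where it would load a package from, without importing it. This is a minimal sketch using only the standard library. The helper names are hypothetical, and it assumes it is run with the same interpreter that Snakemake uses for the pore_c environment.

```python
import importlib.util
import os
import site
import sys


def module_origin(name):
    """Return the file a top-level module would load from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def is_from_user_site(path):
    """True if path lies inside the per-user site-packages directory."""
    if path is None:
        return False
    user_site = os.path.realpath(site.getusersitepackages())
    return os.path.realpath(path).startswith(user_site + os.sep)


if __name__ == "__main__":
    origin = module_origin("Bio")
    print("interpreter prefix  :", sys.prefix)
    print("user site enabled   :", site.ENABLE_USER_SITE)
    print("Bio would load from :", origin)
    print("from user site      :", is_from_user_site(origin))
```

If the last line prints True, pinning the version in environment.yml may not be enough on its own, because the per-user directory is typically searched before the environment's own site-packages. Setting the PYTHONNOUSERSITE environment variable before running the workflow, or removing the copy under .local, are two options worth trying.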