qiime2 / q2-alignment

Support for multiple sequence alignment in QIIME 2.
BSD 3-Clause "New" or "Revised" License

MAFFT's usage of stdout + direct file handle redirection in `run_command` doesn't play nicely with some environments #42

Open jakereps opened 6 years ago

jakereps commented 6 years ago

Bug Description: The stdout/stderr handling in the mafft call does not work in a JupyterHub deployment.

Screenshots

Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.

Command: mafft --preservecase --inputorder --thread 1 /tmp/qiime2-archive-xr3zed18/1fe37b20-35c7-4a07-b0d7-f9e7a3d6b30f/data/dna-sequences.fasta

/home/jorden/dev/mc3/envs/biota/lib/python3.5/site-packages/skbio/io/registry.py:548: FormatIdentificationWarning: <_io.BufferedReader name='/tmp/q2-AlignedDNAFASTAFormat-4i04u915'> does not look like a fasta file
  % (file, fmt), FormatIdentificationWarning)

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-5-59f24acdb7a5> in <module>()
----> 1 biota.util.make_tree(f.DenoisedSequenceVariant)

~/dev/biota_code/biota/util.py in make_tree(seqs, tree_path, overwrite, n_threads)
    347     if not os.path.isfile(tree_path) or overwrite:
    348         q_seqs = qiime2.Artifact.import_data("FeatureData[Sequence]", seqs)
--> 349         aligned_seqs, = alignment.methods.mafft(q_seqs, n_threads=n_threads)
    350         filtered_aligned_seqs, = alignment.methods.mask(aligned_seqs)
    351         tree, = fasttree(filtered_aligned_seqs, n_threads=n_threads)

<decorator-gen-410> in mafft(sequences, n_threads)

~/dev/mc3/envs/biota/lib/python3.5/site-packages/qiime2/sdk/action.py in bound_callable(*args, **kwargs)
    226                 # Execute
    227                 outputs = self._callable_executor_(scope, callable_args,
--> 228                                                    output_types, provenance)
    229 
    230                 if len(outputs) != len(self.signature.outputs):

~/dev/mc3/envs/biota/lib/python3.5/site-packages/qiime2/sdk/action.py in _callable_executor_(self, scope, view_args, output_types, provenance)
    361 
    362     def _callable_executor_(self, scope, view_args, output_types, provenance):
--> 363         output_views = self._callable(**view_args)
    364         output_views = tuplize(output_views)
    365 

~/dev/caporasolab/q2-alignment/q2_alignment/_mafft.py in mafft(sequences, n_threads)
     68     # while aligning, which would be a bug on mafft's end. This is just a
     69     # sanity check and is not expected to trigger in practice.
---> 70     assert len(ids) == len(msa)
     71     for id, seq in zip(ids, msa):
     72         seq.metadata['id'] = id

AssertionError: 

Comments: Changing the call to `subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE...)` resolves the issue.
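A minimal sketch of that workaround, assuming the goal is to capture the child's stdout through a pipe and write it to the output file afterwards, rather than handing the open file handle to the subprocess. The helper name `run_and_capture` and the stand-in command are hypothetical and only for illustration; this is not the actual `run_command` in q2-alignment, and a real call would pass the mafft command list instead.

```python
import os
import subprocess
import sys
import tempfile


def run_and_capture(cmd, output_fp):
    """Run cmd, capture stdout/stderr via pipes, then write stdout to output_fp.

    Hypothetical helper for illustration; not q2-alignment's run_command.
    """
    result = subprocess.run(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, check=True)
    # Write the captured bytes ourselves instead of redirecting the
    # child's stdout straight into an open file handle.
    with open(output_fp, 'wb') as fh:
        fh.write(result.stdout)
    return result


# Stand-in for the mafft call (assumption): a child process that prints
# a small FASTA record to stdout.
demo_cmd = [sys.executable, '-c', "print('>seq1'); print('ACGT')"]

with tempfile.NamedTemporaryFile(suffix='.fasta', delete=False) as tmp:
    demo_path = tmp.name

demo_result = run_and_capture(demo_cmd, demo_path)

with open(demo_path) as fh:
    demo_text = fh.read()

os.remove(demo_path)
```

One trade-off with this approach: the whole alignment is held in memory before it is written out, which may matter for very large inputs.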

jakereps commented 6 years ago

Updated to reflect the actual error encountered. I got the other ones resolved, but this is the one that showed up once and is now the final issue with the way mafft's call is set up.