CCMS-UCSD / GNPS_Workflows

Public Workflows at GNPS
https://gnps.ucsd.edu/

[Qemistree] Initial Implementation #195

Closed mwang87 closed 4 years ago

mwang87 commented 5 years ago

Initial working implementation:

https://proteomics2.ucsd.edu/ProteoSAFe/status.jsp?task=5c80591e7db449ff98c509381f172353

There are still issues with Qemistree tree creation.

mwang87 commented 5 years ago

Fixed issues with tree and network creation:

https://proteomics2.ucsd.edu/ProteoSAFe/status.jsp?task=252fd83712dd461babc317b3c3c25ef6

mwang87 commented 5 years ago

Example Jobs

| Description | FBMN      | Qemistree      |
|-------------|-----------|----------------|
| Small Test  |           | Qemistree Task |
| Louis EMP   | FBMN Task | Qemistree Task |
| C18 QE      |           | Qemistree Task |
| C18 QTOF    |           | Qemistree Task |

mwang87 commented 5 years ago

It's very slow; it is likely worth using CPLEX as the ILP solver along with it.

mwang87 commented 5 years ago

@anupriyatripathi See job:

https://gnps.ucsd.edu/ProteoSAFe/status.jsp?task=53f48becfb764a93a028345ac2fb0b38

Please download the output and try creating the tree. It does not seem to be generated.

anupriyatripathi commented 4 years ago

The tasks look good. I was able to take fingerprints.qza from the GNPS workflow and generate a hierarchy locally. I am re-running the tasks using the latest workflow to see if trees are produced.
An update on the first task: we do need a feature table to run hierarchy generation (make_hierarchy()).

anupriyatripathi commented 4 years ago

Updated links for the above tasks:

| Description     | FBMN | Qemistree |
|-----------------|------|-----------|
| C18 QE          | Job  | Job       |
| C18 QTOF        | Job  | Job       |
| Tomato Endophyte | Job | Job       |

mwang87 commented 4 years ago
anupriyatripathi commented 4 years ago
mwang87 commented 4 years ago

https://gnps-structure.ucsd.edu/structuresimilarity?smiles1={}&smiles2={}
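
The `{}` placeholders in the URL above look like Python `str.format` slots for two SMILES strings. A minimal sketch of filling them in, assuming the SMILES values should be percent-encoded before they go into the query string. The helper name and the example molecules are hypothetical, and the response format of the endpoint is not described in this thread.

```python
from urllib.parse import quote

# URL template copied from the comment above.
TEMPLATE = "https://gnps-structure.ucsd.edu/structuresimilarity?smiles1={}&smiles2={}"


def similarity_url(smiles1: str, smiles2: str) -> str:
    """Hypothetical helper: fill the template with percent-encoded SMILES."""
    # safe="" also encodes characters such as '/', '=', '#', and '+',
    # which have special meaning in URLs but also occur in SMILES.
    return TEMPLATE.format(quote(smiles1, safe=""), quote(smiles2, safe=""))


if __name__ == "__main__":
    # Ethanol and acetic acid, chosen only as illustrative inputs.
    print(similarity_url("CCO", "CC(=O)O"))
```

Encoding matters because SMILES routinely contains `=`, `#`, `/`, and `+`, any of which would otherwise be misread as URL syntax.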

mwang87 commented 4 years ago

Added metadata on proteomics2. Ready for @anupriyatripathi to test.

mwang87 commented 4 years ago

Qemistree has an issue running. Using the command:

qiime diversity beta-phylogenetic --i-phylogeny output_folder/merged_data.qza --p-metric "weighted_unifrac" --o-distance-matrix output_folder/distance_matrix.qza

I get this error:

(1/2) Invalid value for "--i-phylogeny": Expected an artifact of at least type Phylogeny[Rooted]. An artifact of type FeatureData[Molecules] was provided.
(2/2) Missing option "--i-table".
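
The first message means the file passed to `--i-phylogeny` holds the wrong semantic type. A `.qza` artifact is a zip archive whose single top-level directory contains a `metadata.yaml` recording that type, so it can be checked before calling `qiime`. The following is a minimal sketch using only the standard library; the helper name is hypothetical, and it reads the `type:` line directly rather than using a YAML parser.

```python
import zipfile


def artifact_type(qza_path: str) -> str:
    """Hypothetical helper: return the semantic type recorded in a .qza file."""
    with zipfile.ZipFile(qza_path) as archive:
        for name in archive.namelist():
            parts = name.split("/")
            # The top-level metadata file lives at <uuid>/metadata.yaml;
            # deeper metadata.yaml files belong to provenance and are skipped.
            if len(parts) == 2 and parts[1] == "metadata.yaml":
                text = archive.read(name).decode("utf-8")
                for line in text.splitlines():
                    if line.startswith("type:"):
                        return line.split(":", 1)[1].strip().strip("'\"")
    raise ValueError(f"no top-level metadata.yaml with a type in {qza_path}")
```

Going by the error text, calling this on `output_folder/merged_data.qza` would presumably report `FeatureData[Molecules]`, whereas `--i-phylogeny` needs a file that reports `Phylogeny[Rooted]`.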

mwang87 commented 4 years ago

For this command:

qiime diversity beta-phylogenetic --i-table output_folder/merged_feature_table.qza --i-phylogeny output_folder/qemistree.qza --p-metric "weighted_unifrac" --o-distance-matrix output_folder/distance_matrix.qza

I currently see this bug:

Plugin error from diversity:

'latin-1' codec can't encode character '\u03b2' in position 3405: ordinal not in range(256)

Debug info has been saved to /tmp/qiime2-q2cli-err-4ocq2823.log

Here is the full traceback:

Traceback (most recent call last):
  File "/data/beta-proteomics2/tools/miniconda3_gamma/envs/qiime2-2019.4/lib/python3.6/site-packages/q2cli/commands.py", line 311, in __call__
    results = action(**arguments)
  File "</data/beta-proteomics2/tools/miniconda3_gamma/envs/qiime2-2019.4/lib/python3.6/site-packages/decorator.py:decorator-gen-385>", line 2, in beta_phylogenetic
  File "/data/beta-proteomics2/tools/miniconda3_gamma/envs/qiime2-2019.4/lib/python3.6/site-packages/qiime2/sdk/action.py", line 231, in bound_callable
    output_types, provenance)
  File "/data/beta-proteomics2/tools/miniconda3_gamma/envs/qiime2-2019.4/lib/python3.6/site-packages/qiime2/sdk/action.py", line 393, in _callable_executor_
    spec.qiime_type, output_view, spec.view_type, prov)
  File "/data/beta-proteomics2/tools/miniconda3_gamma/envs/qiime2-2019.4/lib/python3.6/site-packages/qiime2/sdk/result.py", line 271, in _from_view
    provenance_capture=provenance_capture)
  File "/data/beta-proteomics2/tools/miniconda3_gamma/envs/qiime2-2019.4/lib/python3.6/site-packages/qiime2/core/archive/archiver.py", line 316, in from_data
    Format.write(rec, type, format, data_initializer, provenance_capture)
  File "/data/beta-proteomics2/tools/miniconda3_gamma/envs/qiime2-2019.4/lib/python3.6/site-packages/qiime2/core/archive/format/v5.py", line 21, in write
    provenance_capture)
  File "/data/beta-proteomics2/tools/miniconda3_gamma/envs/qiime2-2019.4/lib/python3.6/site-packages/qiime2/core/archive/format/v1.py", line 26, in write
    prov_dir, [root / cls.METADATA_FILE, archive_record.version_fp])
  File "/data/beta-proteomics2/tools/miniconda3_gamma/envs/qiime2-2019.4/lib/python3.6/site-packages/qiime2/core/archive/provenance.py", line 313, in finalize
    self.write_citations_bib()
  File "/data/beta-proteomics2/tools/miniconda3_gamma/envs/qiime2-2019.4/lib/python3.6/site-packages/qiime2/core/archive/provenance.py", line 304, in write_citations_bib
    self.citations.save(str(self.path / self.CITATION_FILE))
  File "/data/beta-proteomics2/tools/miniconda3_gamma/envs/qiime2-2019.4/lib/python3.6/site-packages/qiime2/core/cite.py", line 71, in save
    bp.dump(db, f, writer=writer)
  File "/data/beta-proteomics2/tools/miniconda3_gamma/envs/qiime2-2019.4/lib/python3.6/site-packages/bibtexparser/__init__.py", line 111, in dump
    bibtex_file.write(writer.write(bib_database))
UnicodeEncodeError: 'latin-1' codec can't encode character '\u03b2' in position 3405: ordinal not in range(256)
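
The traceback ends where bibtexparser writes the citations file: the text stream uses the latin-1 encoding, which cannot represent '\u03b2' (the Greek letter beta). When Python opens a text file without an explicit encoding it falls back to the locale's preferred encoding, so this kind of failure usually points to a non-UTF-8 locale on the machine. That diagnosis is an assumption about this job; the thread does not record the actual fix. A minimal sketch that reproduces the failure in isolation and shows the same text succeeding with UTF-8:

```python
import io


def can_write(text: str, encoding: str) -> bool:
    """Return True if `text` can be written through a stream with `encoding`."""
    stream = io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    try:
        stream.write(text)
        stream.flush()
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    sample = "\u03b2-diversity"  # illustrative text containing the beta character
    print(can_write(sample, "latin-1"))
    print(can_write(sample, "utf-8"))
```

If the locale is indeed the cause, a common remedy is to export a UTF-8 locale (for example through `LC_ALL` and `LANG`) in the environment that launches `qiime`.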
mwang87 commented 4 years ago

The issues with running are addressed now. @anupriyatripathi will rerun, and a side project will continue the evaluation of results.

mwang87 commented 4 years ago

Technical implementation seems to be fixed.

Research will continue out of band.