metagenome-atlas / atlas_analyze

Scripts to get the most out of the output of metagenome-atlas

Issue in atlas_analyze.py at import_files step #4

Closed: boaty closed this issue 3 years ago

boaty commented 3 years ago

Hi Silas,

I got this error while running analyze.py. The error pops up at the import_files step.

I am also wondering whether it is OK to stop here, because the program already gave me all the output I need (taxonomy.tsv, mapping_rate.tsv, genome_completeness.tsv, counts/raw_counts_genomes.tsv, counts/median_coverage_genomes.tsv, annotations/KO.tsv, annotations/CAZy.tsv).

Thank you.

Zhou

--------------------------------------error-----------------------
Job counts:
    count   jobs
    1   import_files
    1

[Mon Feb 8 14:47:37 2021]
Finished job 5.
3 of 6 steps (50%) done
Select jobs to execute...

[Mon Feb 8 14:47:37 2021]
localrule analyze:
    input: Results/taxonomy.tsv, Results/mapping_rate.tsv, Results/genome_completeness.tsv, Results/counts/raw_counts_genomes.tsv, Results/counts/median_coverage_genomes.tsv, Results/annotations/KO.tsv, Results/annotations/CAZy.tsv, Results/annotations/KO.tsv, Results/annotations/CAZy.tsv
    log: Results/Code.ipynb
    jobid: 2

[NbConvertApp] ERROR | Notebook JSON is invalid: Additional properties are not allowed ('id' was unexpected)

Failed validating 'additionalProperties' in code_cell:

On instance['cells'][0]: {'cell_type': 'code', 'execution_count': None, 'id': 'talented-colors', 'metadata': {'tags': ['snakemake-job-properties']}, 'outputs': ['...0 outputs...'], 'source': '\n' '######## snakemake preamble start (automatically inserted, do ' 'n...'}

Traceback (most recent call last):
  File "/home/anaconda3/envs/analyze/bin/jupyter-nbconvert", line 11, in <module>
    sys.exit(main())
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/jupyter_core/application.py", line 254, in launch_instance
    return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/traitlets/config/application.py", line 845, in launch_instance
    app.start()
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 350, in start
    self.convert_notebooks()
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 524, in convert_notebooks
    self.convert_single_notebook(notebook_filename)
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 489, in convert_single_notebook
    output, resources = self.export_single_notebook(notebook_filename, resources, input_buffer=input_buffer)
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 418, in export_single_notebook
    output, resources = self.exporter.from_filename(notebook_filename, resources=resources)
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 181, in from_filename
    return self.from_file(f, resources=resources, **kw)
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 199, in from_file
    return self.from_notebook_node(nbformat.read(file_stream, as_version=4), resources=resources, **kw)
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/exporters/notebook.py", line 32, in from_notebook_node
    nb_copy, resources = super().from_notebook_node(nb, resources, **kw)
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 143, in from_notebook_node
    nb_copy, resources = self._preprocess(nb_copy, resources)
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 318, in _preprocess
    nbc, resc = preprocessor(nbc, resc)
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/preprocessors/base.py", line 47, in __call__
    return self.preprocess(nb, resources)
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 79, in preprocess
    self.execute()
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/nbclient/util.py", line 74, in wrapped
    return just_run(coro(*args, **kwargs))
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/nbclient/util.py", line 53, in just_run
    return loop.run_until_complete(coro)
  File "/home/anaconda3/envs/analyze/lib/python3.7/asyncio/base_events.py", line 587, in run_until_complete
    return future.result()
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/nbclient/client.py", line 541, in async_execute
    cell, index, execution_count=self.code_cells_executed + 1
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 123, in async_execute_cell
    cell, resources = self.preprocess_cell(cell, self.resources, cell_index)
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/preprocessors/execute.py", line 146, in preprocess_cell
    cell = run_sync(NotebookClient.async_execute_cell)(self, cell, index, store_history=self.store_history)
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/nbclient/util.py", line 74, in wrapped
    return just_run(coro(*args, **kwargs))
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/nbclient/util.py", line 53, in just_run
    return loop.run_until_complete(coro)
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/nest_asyncio.py", line 98, in run_until_complete
    return f.result()
  File "/home/anaconda3/envs/analyze/lib/python3.7/asyncio/futures.py", line 181, in result
    raise self._exception
  File "/home/anaconda3/envs/analyze/lib/python3.7/asyncio/tasks.py", line 249, in __step
    result = coro.send(None)
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/nbclient/client.py", line 832, in async_execute_cell
    self._check_raise_for_error(cell, exec_reply)
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/nbclient/client.py", line 740, in _check_raise_for_error
    raise CellExecutionError.from_cell_and_msg(cell, exec_reply['content'])
nbclient.exceptions.CellExecutionError: An error occurred while executing the following cell:

# get most abundant genomes

counts_per_genome= relab.sum().sort_values()
ax= counts_per_genome[-10:].plot.bar(figsize=(10,5))

_= ax.set_xticklabels(Labels.loc[counts_per_genome.index[-10:]])
ax.set_title('Most abundant genomes')
ax.set_ylabel('Abundance [relab]')


TypeError                                 Traceback (most recent call last)
in
      2 
      3 counts_per_genome= relab.sum().sort_values()
----> 4 ax= counts_per_genome[-10:].plot.bar(figsize=(10,5))
      5 
      6 _= ax.set_xticklabels(Labels.loc[counts_per_genome.index[-10:]])

~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/plotting/_core.py in bar(self, x, y, **kwargs)
    946         >>> ax = df.plot.bar(x='lifespan', rot=0)
    947         """
--> 948         return self(kind="bar", x=x, y=y, **kwargs)
    949 
    950     def barh(self, x=None, y=None, **kwargs):

~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/plotting/_core.py in __call__(self, *args, **kwargs)
    792                 data.columns = label_name
    793 
--> 794         return plot_backend.plot(data, kind=kind, **kwargs)
    795 
    796     def line(self, x=None, y=None, **kwargs):

~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/plotting/_matplotlib/__init__.py in plot(data, kind, **kwargs)
     60         kwargs["ax"] = getattr(ax, "left_ax", ax)
     61     plot_obj = PLOT_CLASSES[kind](data, **kwargs)
---> 62     plot_obj.generate()
     63     plot_obj.draw()
     64     return plot_obj.result

~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/plotting/_matplotlib/core.py in generate(self)
    277     def generate(self):
    278         self._args_adjust()
--> 279         self._compute_plot_data()
    280         self._setup_subplots()
    281         self._make_plot()

~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/plotting/_matplotlib/core.py in _compute_plot_data(self)
    402         data = data._convert(datetime=True, timedelta=True)
    403         numeric_data = data.select_dtypes(
--> 404             include=[np.number, "datetime", "datetimetz", "timedelta"]
    405         )
    406 

~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/core/frame.py in select_dtypes(self, include, exclude)
   3440         # the "union" of the logic of case 1 and case 2:
   3441         # we get the included and excluded, and return their logical and
-> 3442         include_these = Series(not bool(include), index=self.columns)
   3443         exclude_these = Series(not bool(exclude), index=self.columns)
   3444 

~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/core/series.py in __init__(self, data, index, dtype, name, copy, fastpath)
    312                     data = data.copy()
    313             else:
--> 314                 data = sanitize_array(data, index, dtype, copy, raise_cast_failure=True)
    315 
    316             data = SingleBlockManager(data, index, fastpath=True)

~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/core/internals/construction.py in sanitize_array(data, index, dtype, copy, raise_cast_failure)
    710             value = maybe_cast_to_datetime(value, dtype)
    711 
--> 712             subarr = construct_1d_arraylike_from_scalar(value, len(index), dtype)
    713 
    714         else:

~/anaconda3/envs/analyze/lib/python3.7/site-packages/pandas/core/dtypes/cast.py in construct_1d_arraylike_from_scalar(value, length, dtype)
   1231                 value = ensure_str(value)
   1232 
-> 1233         subarr = np.empty(length, dtype=dtype)
   1234         subarr.fill(value)
   1235 

TypeError: Cannot interpret '' as a data type
TypeError: Cannot interpret '' as a data type

[Mon Feb 8 14:52:19 2021]
Error in rule analyze:
    jobid: 2
    log: Results/Code.ipynb (check log file(s) for error message)

RuleException:
CalledProcessError in line 82 of /home/Desktop/atlasTest/atlas_analyze/Snakefile:
Command 'set -euo pipefail; jupyter-nbconvert --log-level ERROR --execute --output /data/GV009/GV009_035/Results/Code.ipynb --to notebook --ExecutePreprocessor.timeout=-1 /data/GV009/GV009_035/.snakemake/scripts/tmpntaqydkv.Analyis_genome_abundances.ipynb' returned non-zero exit status 1.
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2340, in run_wrapper
  File "/home/Desktop/atlasTest/atlas_analyze/Snakefile", line 82, in __rule_analyze
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 568, in _callback
  File "/home/anaconda3/envs/analyze/lib/python3.7/concurrent/futures/thread.py", line 57, in run
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 554, in cached_or_run
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/snakemake/executors/__init__.py", line 2352, in run_wrapper
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /data/GV009/GV009_035/.snakemake/log/2021-02-08T144532.032128.snakemake.log
Traceback (most recent call last):
  File "/home/Desktop/atlasTest/atlas_analyze/analyze.py", line 21, in <module>
    "snakemake "
  File "/home/anaconda3/envs/analyze/lib/python3.7/site-packages/snakemake/shell.py", line 213, in __new__
    raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail; snakemake -d . -j 1 -s /home/Desktop/atlasTest/atlas_analyze/Snakefile' returned non-zero exit status 1.
bhimbbiswa commented 3 years ago

Hi.

I also got a similar error.

Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 1 (use --cores to define parallelism)
Rules claiming more threads will be scaled down.
Job counts:
    count   jobs
    1   all
    1   convert_nb
    2
Select jobs to execute...

[Tue Feb  9 15:29:47 2021]
rule convert_nb:
    input: Results/Code.ipynb
    output: Results/Summary.html
    jobid: 1

[NbConvertApp] Converting notebook Results/Code.ipynb to html
Traceback (most recent call last):
  File "/home/bhimbiswa/anaconda3/envs/analyze/lib/python3.7/site-packages/nbformat/reader.py", line 14, in parse_json
    nb_dict = json.loads(s, **kwargs)
  File "/home/bhimbiswa/anaconda3/envs/analyze/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/home/bhimbiswa/anaconda3/envs/analyze/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/bhimbiswa/anaconda3/envs/analyze/lib/python3.7/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/bhimbiswa/anaconda3/envs/analyze/bin/jupyter-nbconvert", line 11, in <module>
    sys.exit(main())
  File "/home/bhimbiswa/anaconda3/envs/analyze/lib/python3.7/site-packages/jupyter_core/application.py", line 254, in launch_instance
    return super(JupyterApp, cls).launch_instance(argv=argv, **kwargs)
  File "/home/bhimbiswa/anaconda3/envs/analyze/lib/python3.7/site-packages/traitlets/config/application.py", line 845, in launch_instance
    app.start()
  File "/home/bhimbiswa/anaconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 350, in start
    self.convert_notebooks()
  File "/home/bhimbiswa/anaconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 524, in convert_notebooks
    self.convert_single_notebook(notebook_filename)
  File "/home/bhimbiswa/anaconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 489, in convert_single_notebook
    output, resources = self.export_single_notebook(notebook_filename, resources, input_buffer=input_buffer)
  File "/home/bhimbiswa/anaconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/nbconvertapp.py", line 418, in export_single_notebook
    output, resources = self.exporter.from_filename(notebook_filename, resources=resources)
  File "/home/bhimbiswa/anaconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 181, in from_filename
    return self.from_file(f, resources=resources, **kw)
  File "/home/bhimbiswa/anaconda3/envs/analyze/lib/python3.7/site-packages/nbconvert/exporters/exporter.py", line 199, in from_file
    return self.from_notebook_node(nbformat.read(file_stream, as_version=4), resources=resources, **kw)
  File "/home/bhimbiswa/anaconda3/envs/analyze/lib/python3.7/site-packages/nbformat/__init__.py", line 143, in read
    return reads(buf, as_version, **kwargs)
  File "/home/bhimbiswa/anaconda3/envs/analyze/lib/python3.7/site-packages/nbformat/__init__.py", line 73, in reads
    nb = reader.reads(s, **kwargs)
  File "/home/bhimbiswa/anaconda3/envs/analyze/lib/python3.7/site-packages/nbformat/reader.py", line 58, in reads
    nb_dict = parse_json(s, **kwargs)
  File "/home/bhimbiswa/anaconda3/envs/analyze/lib/python3.7/site-packages/nbformat/reader.py", line 17, in parse_json
    raise NotJSONError(("Notebook does not appear to be JSON: %r" % s)[:77] + "...") from e
nbformat.reader.NotJSONError: Notebook does not appear to be JSON: ''...
[Tue Feb  9 15:29:56 2021]
Error in rule convert_nb:
    jobid: 1
    output: Results/Summary.html
    shell:
        jupyter nbconvert --output Summary --to=html --TemplateExporter.exclude_input=True Results/Code.ipynb
        (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message
Complete log: /lustre7/home/bhimbiswa/New_micro/.snakemake/log/2021-02-09T152945.506087.snakemake.log
Traceback (most recent call last):
  File "/lustre7/home/bhimbiswa/atlas_analyze/analyze.py", line 21, in <module>
    "snakemake "
  File "/home/bhimbiswa/anaconda3/envs/analyze/lib/python3.7/site-packages/snakemake/shell.py", line 213, in __new__
    raise sp.CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'set -euo pipefail;  snakemake -d /lustre7/home/bhimbiswa/New_micro -j 1 -s /lustre7/home/bhimbiswa/atlas_analyze/Snakefile' returned non-zero exit status 1.
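The `NotJSONError: Notebook does not appear to be JSON: ''` above means `Results/Code.ipynb` exists but is empty, most likely because the analyze step failed before the notebook was written, so `convert_nb` has nothing to convert. A quick check, as a sketch (run it from the project working directory so the relative path resolves):

```python
# Sketch: confirm whether Results/Code.ipynb is a real notebook or an empty leftover
# from the failed analyze rule. The path is relative to the Snakemake working directory.
import json
from pathlib import Path

nb_path = Path("Results/Code.ipynb")
print("size:", nb_path.stat().st_size, "bytes")
try:
    json.loads(nb_path.read_text())
    print("valid JSON notebook")
except json.JSONDecodeError as err:
    print("not valid JSON:", err)
```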

Regards Bhim

SilasK commented 3 years ago

OK, it's not so important. You have all the output files. Have a look at `Results/Code.ipynb` and at the R/Python code in https://github.com/metagenome-atlas/Tutorial

But I will try to fix this soon.
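For anyone who stops at this point, a minimal sketch of working with the exported tables directly in pandas, assuming the Results/ layout shown in the logs above; the orientation of the count table (samples x genomes vs. genomes x samples) and the normalisation step are illustrative, not the exact Tutorial code:

```python
# Sketch: load the atlas_analyze output tables and look at the most abundant genomes.
# Assumes the Results/ layout from the logs above; transpose `counts` if your table
# is genomes x samples rather than samples x genomes.
import pandas as pd

taxonomy = pd.read_csv("Results/taxonomy.tsv", sep="\t", index_col=0)
counts = pd.read_csv("Results/counts/median_coverage_genomes.tsv", sep="\t", index_col=0)

# convert median coverage to relative abundance per sample
relab = counts.div(counts.sum(axis=1), axis=0)

# ten genomes with the highest mean relative abundance across samples
top10 = relab.mean().sort_values().tail(10)
print(top10)
```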