nextstrain / augur

Pipeline components for real-time phylodynamic analysis
https://docs.nextstrain.org/projects/augur/
GNU Affero General Public License v3.0

Augur traits error #1626

Closed vanessazubach closed 2 months ago

vanessazubach commented 2 months ago

Current Behavior

I tried to run augur traits. This was my command:

augur traits \
  --tree results/tree.nwk \
  --metadata data/metadata.tsv \
  --output-node-data results/traits.json \
  --columns region country \
  --confidence

Errors:
Traceback (most recent call last):
  File "/home/vzubach/.nextstrain/runtimes/conda/env/lib/python3.11/site-packages/augur/__init__.py", line 66, in run
    return args.__command__.run(args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vzubach/.nextstrain/runtimes/conda/env/lib/python3.11/site-packages/augur/traits.py", line 134, in run
    traits = read_metadata(
             ^^^^^^^^^^^^^^
  File "/home/vv/.nextstrain/runtimes/conda/env/lib/python3.11/site-packages/augur/io/metadata.py", line 157, in read_metadata
    return pd.read_csv(
           ^^^^^^^^^^^^
  File "/home/vv/.nextstrain/runtimes/conda/env/lib/python3.11/site-packages/pandas/util/_decorators.py", line 211, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/vv/.nextstrain/runtimes/conda/env/lib/python3.11/site-packages/pandas/util/_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/vv/.nextstrain/runtimes/conda/env/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 950, in read_csv
    return _read(filepath_or_buffer, kwds)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vv/.nextstrain/runtimes/conda/env/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 611, in _read
    return parser.read(nrows)
           ^^^^^^^^^^^^^^^^^^
  File "/home/vv/.nextstrain/runtimes/conda/env/lib/python3.11/site-packages/pandas/io/parsers/readers.py", line 1778, in read
    ) = self._engine.read(  # type: ignore[attr-defined]
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/vv/.nextstrain/runtimes/conda/env/lib/python3.11/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 235, in read
    data = self._reader.read(nrows)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "pandas/_libs/parsers.pyx", line 790, in pandas._libs.parsers.TextReader.read
  File "pandas/_libs/parsers.pyx", line 883, in pandas._libs.parsers.TextReader._read_rows
  File "pandas/_libs/parsers.pyx", line 1973, in pandas._libs.parsers.raise_parser_error
pandas.errors.ParserError: Error tokenizing data. C error: Expected 4 fields in line 127, saw 5
An error occurred (see above) that has not been properly handled by Augur.
To report this, please open a new issue including the original command and the error above:
    <https://github.com/nextstrain/augur/issues/new/choose>

Expected behavior

The 'results/traits.json' file should have been created, but it was not.


genehack commented 2 months ago

Based on this error:

pandas.errors.ParserError: Error tokenizing data. C error: Expected 4 fields in line 127, saw 5

I expect there's some sort of encoding or data layout issue on line 127 of your metadata.tsv file — would it be possible for you to share that file?
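While waiting on the file, a quick local check can narrow this down. The following is a minimal sketch, not part of Augur: `find_ragged_rows` is a hypothetical helper that uses only the Python standard library to list rows whose field count differs from the header. It assumes the file is tab-delimited with a header row, and that its 1-based line numbers (header included) roughly line up with the line number pandas reports.

```python
import csv
import io


def find_ragged_rows(handle, delimiter="\t"):
    """Return (expected_field_count, [(line_number, field_count), ...]).

    Hypothetical helper for illustration only. Line numbers are 1-based
    and count the header line. Completely blank lines are skipped.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    expected = len(header)
    ragged = []
    for row in reader:
        # csv.reader yields [] for blank lines; ignore those.
        if row and len(row) != expected:
            ragged.append((reader.line_num, len(row)))
    return expected, ragged


# Small in-memory example: the last row has a trailing tab,
# which produces a fifth (empty) field.
sample = (
    "strain\tregion\tcountry\tdate\n"
    "A\tEurope\tFrance\t2020\n"
    "B\tAsia\tJapan\t2021\t\n"
)
expected, ragged = find_ragged_rows(io.StringIO(sample))
print(expected, ragged)
```

To run it against the real file, open it with `open("data/metadata.tsv", newline="")` and pass the handle in place of the `io.StringIO` object.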

vanessazubach commented 2 months ago

Thank you for your help. Yes, there was an extra space on line 127 causing the error.

Best, Vanessa
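Stray whitespace like this is hard to spot in an editor. As a rough sketch (the `describe_whitespace` helper is hypothetical, and it assumes tab-delimited fields), printing the `repr` of each suspicious field makes leading or trailing spaces and empty trailing fields visible:

```python
def describe_whitespace(line):
    """Return (field_index, repr(field)) for fields that are empty or
    have leading/trailing whitespace. Hypothetical helper for illustration."""
    fields = line.rstrip("\n").split("\t")
    return [
        (index, repr(field))
        for index, field in enumerate(fields)
        if field == "" or field != field.strip()
    ]


# Example: a trailing space in the second field and a trailing tab
# that creates an empty fifth field.
suspicious = describe_whitespace("B\tAsia \tJapan\t2021\t\n")
print(suspicious)
```

Applied to line 127 of the metadata file, a check like this would show which field carries the extra character.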
