MontrealCorpusTools / PolyglotDB

Language data store and linguistic query API
MIT License

Influxdb error on writing CSV #158

Closed james-tanner closed 5 years ago

james-tanner commented 5 years ago

In running formant tracks for Hastings, I get the following output whilst writing to CSV:

Traceback (most recent call last):
  File "formant_track.py", line 130, in <module>
    corpus_conf['speakers'], vowel_prototypes_path = vowel_prototypes_path)
  File "formant_track.py", line 85, in formant_track_export
    q.to_csv(csv_path)
  File "/data/iscan-spade-server/src/polyglotdb/polyglotdb/query/annotations/query.py", line 404, in to_csv
    r.to_csv(path, mode=mode)
  File "/data/iscan-spade-server/src/polyglotdb/polyglotdb/query/annotations/results.py", line 227, in to_csv
    super(QueryResults, self).to_csv(path, mode=mode)
  File "/data/iscan-spade-server/src/polyglotdb/polyglotdb/query/base/results.py", line 122, in to_csv
    save_results(self.rows_for_csv(), path, header=self.columns, mode=mode)
  File "/data/iscan-spade-server/src/polyglotdb/polyglotdb/io/exporters/csv.py", line 50, in save_results
    for line in results:
  File "/data/iscan-spade-server/src/polyglotdb/polyglotdb/query/annotations/results.py", line 212, in rows_for_csv
    for line in self:
  File "/data/iscan-spade-server/src/polyglotdb/polyglotdb/query/base/results.py", line 111, in __iter__
    r = self._sanitize_record(r)
  File "/data/iscan-spade-server/src/polyglotdb/polyglotdb/query/annotations/results.py", line 196, in _sanitize_record
    data = self.corpus.get_utterance_acoustics(a.attribute.label, utterance_id, discourse, speaker)
  File "/data/iscan-spade-server/src/polyglotdb/polyglotdb/corpus/audio.py", line 692, in get_utterance_acoustics
    result = client.query(query)
  File "/home/linguistics/jtanner/miniconda3/lib/python3.6/site-packages/influxdb/client.py", line 416, in query
    expected_response_code=expected_response_code
  File "/home/linguistics/jtanner/miniconda3/lib/python3.6/site-packages/influxdb/client.py", line 286, in request
    raise InfluxDBClientError(response.content, response.status_code)
influxdb.exceptions.InfluxDBClientError: 400: {"error":"error parsing query: found s, expected ; at line 4, char 48"}

I tried to do some digging online as to what this would mean, but it's still pretty opaque to me.

msonderegger commented 5 years ago

@james-tanner I think debugging this will need the exact query that triggered the error. If you're running on oka, I think (?) you can see the exact queries sent in the celery (?) log. Could you post it here?

(On roquefort I'm less sure what you'd do. You can consult MG on Slack for a faster response on how to see the actual queries being run.)
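
If the server logs don't show the query, one way to capture it is to wrap the call that sends it. The following is only a sketch: `query_with_logging` is a hypothetical helper, not part of PolyglotDB or the influxdb client, and it assumes nothing more than a client object with a `query(query)` method. It prints the failing query with line numbers, so a position such as "line 4, char 48" from the error message can be located.

```python
import logging

logger = logging.getLogger("influx_debug")


def query_with_logging(client, query):
    """Run client.query(query); on failure, log the exact query text and re-raise.

    Hypothetical debugging helper: `client` is any object exposing a
    `query(str)` method, such as an InfluxDB client instance.
    """
    try:
        return client.query(query)
    except Exception:
        # Number the lines so the parser's "line N, char M" can be matched up.
        numbered = "\n".join(
            "{:>3}: {}".format(i, line)
            for i, line in enumerate(query.splitlines(), start=1)
        )
        logger.error("Query failed:\n%s", numbered)
        raise
```

Temporarily swapping this in for the direct `client.query(query)` call should leave behaviour unchanged apart from the extra log entry, since the original exception is re-raised.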

james-tanner commented 5 years ago

I'm not sure where to find this. The influxdb.log looks fine, and I can't see where the celery logs would be stored (the screen running celery hasn't printed anything new). I've checked the polyglot_data directory to make sure the corpus itself has been parsed correctly, and it looks fine (each file has multiple utterances, phones, words, etc.).

Any advice about where to look for debugging this?

msonderegger commented 5 years ago

Nope, no idea. I'd ping @mmcauliffe about this on Slack iscan-dev.

mmcauliffe commented 5 years ago

This was an issue with not escaping apostrophes in speaker names; it's fixed now.
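
For anyone hitting the same error: InfluxQL string literals are single-quoted, so an unescaped apostrophe in a speaker name ends the literal early and the parser trips over whatever follows, which is consistent with the "found s, expected ;" message above. Below is a minimal sketch of the kind of escaping involved. It is not the actual PolyglotDB fix: `escape_influx_string`, `build_speaker_query`, the measurement name, and the speaker name are all hypothetical, and the real query has more clauses.

```python
def escape_influx_string(value):
    """Escape a value for use inside a single-quoted InfluxQL string literal."""
    # Escape backslashes first so the ones added for quotes are not doubled.
    return str(value).replace("\\", "\\\\").replace("'", "\\'")


def build_speaker_query(measurement, speaker):
    # Hypothetical query shape, for illustration only.
    return 'SELECT * FROM "{}" WHERE "speaker" = \'{}\''.format(
        measurement, escape_influx_string(speaker)
    )


# The apostrophe is preceded by a backslash in the generated query.
print(build_speaker_query("formants", "o'brien"))
```

Recent versions of the influxdb Python client also accept a `bind_params` argument to `query`, which may avoid manual escaping altogether if the installed version supports it.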