Open xhluca opened 1 year ago
Hi @xhluca,
Sorry for the late reply.
Is the issue limited to Tevatron/wikipedia-wq-corpus, or does Tevatron/wikipedia-nq-corpus also fail?
It seems like an issue caused by the json environment?
    data = json.loads(line)
  File "/opt/conda/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/opt/conda/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/opt/conda/lib/python3.7/json/decoder.py", line 353, in raw_decode
    obj, end = self.scan_once(s, idx)
Let me know if you are still having the issue.
Xueguang
I'm not sure what "json environment" means here. I'm using the standard Python 3.7 library in a fresh virtualenv.
I tried different datasets and the problem seems to be present with each of them.
Could you check whether a simple JSONL file can be read in your environment? Or could you try a conda environment? My environment is Python 3.8 with conda.
Yes, I tried the following example: https://stackoverflow.com/questions/50475635/loading-jsonl-file-as-json-objects
And it works fine in my environment.
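For anyone who wants to repeat this check on their own data, here is a minimal sketch that reads a JSONL file line by line and reports which lines fail to parse. It is not part of tevatron; the function name `find_bad_jsonl_lines` and the sample file are hypothetical and only serve as illustration. It uses nothing beyond the standard library `json` module.

```python
import json
import tempfile
from pathlib import Path


def find_bad_jsonl_lines(path):
    """Return a list of (line_number, error_message) for lines that fail to parse."""
    bad = []
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                json.loads(line)
            except json.JSONDecodeError as exc:
                bad.append((lineno, str(exc)))
    return bad


# Hypothetical sample: the first line is valid, the second is truncated mid-string.
sample = Path(tempfile.mkdtemp()) / "sample.jsonl"
sample.write_text(
    '{"docid": "1", "text": "ok"}\n'
    '{"docid": "2", "text": "truncated\n',
    encoding="utf-8",
)

for lineno, message in find_bad_jsonl_lines(sample):
    print(f"line {lineno}: {message}")
```

If this reports failures on the corpus file itself, the data on disk is probably incomplete or corrupted rather than the Python environment being at fault. If it reports nothing, the problem more likely lies in how the lines reach `json.loads`.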
@MXueguang My bad, I was indeed using conda. However, do you think it should make a difference whether I'm using conda or virtualenv, since the libraries were installed with pip and there's no conda-specific dependency?
I first used tevatron to train DPR from bert-base-uncased:
After the model was saved to model_wq/ (see footnote), I continued to follow the instructions to encode the passages:
I saved that inside a bash file and ran it, but I got multiple JSONDecodeError exceptions along the way, which does not seem to be expected (which is why I stopped the process):
Is this normal?
Libraries
This is my requirements file:
Footnote
model_nq but renamed it to model_wq. I don't think this makes a difference, but if it does, let me know.
master and also with the 0.1 version on PyPI, and I'm getting the same error.