I'm trying to create dense representations of my corpus and search paragraphs/phrases by keywords or a question. I don't have labeled questions and answers, and for now I don't need to extract answers, just retrieve documents that may contain the answer.
I built a JSON file with my corpus (pt-br) like this:
{
  "data": [
    {
      "title": "Radicais livres: o que são, efeitos no corpo e como se proteger",
      "paragraphs": [
        {
          "context": "Os radicais livres ..."
        },
        {
          "context": "Desta forma, quanto menos radicais livres, ..."
        }, ...
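For reference, a file with this shape can be generated with a short script. This is only a minimal sketch, assuming the corpus is available as (title, list of paragraphs) pairs; the helper name `build_corpus` and the variable names are hypothetical and not part of DensePhrases.

```python
import json


def build_corpus(docs):
    """Build the {"data": [...]} structure from (title, [paragraph, ...]) pairs.

    Hypothetical helper for illustration; not part of DensePhrases.
    """
    return {
        "data": [
            {
                "title": title,
                "paragraphs": [{"context": text} for text in paragraphs],
            }
            for title, paragraphs in docs
        ]
    }


docs = [
    (
        "Radicais livres: o que são, efeitos no corpo e como se proteger",
        ["Os radicais livres ...", "Desta forma, quanto menos radicais livres, ..."],
    ),
]

corpus = build_corpus(docs)

# ensure_ascii=False keeps accented pt-br characters readable instead of \uXXXX escapes.
serialized = json.dumps(corpus, ensure_ascii=False, indent=2)
print(serialized)
```

To write the result to disk, open the target file with `encoding="utf-8"` and pass the same arguments to `json.dump`.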
Those commands seem to work fine. Here are the contents of output_dir:
Now, when I try to use the model:
from densephrases import DensePhrases

model = DensePhrases(
    load_dir='princeton-nlp/densephrases-multi',
    dump_dir='./data/densephrases-multi_sample/dump/',
    index_name='start/128_flat_OPQ96'
)
This raises the following error:
>>>
This could take up to 15 mins depending on the file reading speed of HDD/SSD
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/projetos/u4vn/DensePhrases/densephrases/model.py", line 52, in __init__
    self.truecase = TrueCaser(os.path.join(os.environ['DATA_DIR'], self.args.truecase_path))
  File "/projetos/u4vn/DensePhrases/densephrases/utils/data_utils.py", line 366, in __init__
    with open(dist_file_path, "rb") as distributions_file:
FileNotFoundError: [Errno 2] No such file or directory: './data/truecase/english_with_questions.dist'
I guess it's caused by the $DATA_DIR environment variable configuration. I got this kind of error when config.sh had not run properly.
Whenever it happens, I just run config.sh again, and it works fine.
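A quick way to check this before constructing the model is to resolve the path the same way the traceback does (joining `DATA_DIR` with the truecase path) and see whether the file exists. This is a rough sketch: the function name `truecase_file_status` is hypothetical, and the default relative path `truecase/english_with_questions.dist` is an assumption inferred from the error message, not a confirmed DensePhrases default.

```python
import os


def truecase_file_status(truecase_path="truecase/english_with_questions.dist", env=None):
    """Return (resolved_path, exists) for the truecase file under DATA_DIR.

    Hypothetical helper; the default truecase_path is an assumption
    inferred from the path shown in the FileNotFoundError.
    """
    env = os.environ if env is None else env
    data_dir = env.get("DATA_DIR")
    if data_dir is None:
        # DATA_DIR is not set at all, so nothing can be resolved.
        return None, False
    resolved = os.path.abspath(os.path.join(data_dir, truecase_path))
    return resolved, os.path.isfile(resolved)


path, exists = truecase_file_status()
print("DATA_DIR =", os.environ.get("DATA_DIR"))
print("truecase file =", path, "(found)" if exists else "(missing)")
```

If the file is reported as missing, either `DATA_DIR` points to the wrong directory (a relative value such as `./data` depends on the current working directory) or the truecase file was never downloaded into it.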
What am I missing? What file is this?