Hi @thaisnang, could you provide more details on what code/tutorial you're running and the full error stack trace that you get?
I ran Tutorial 1 with FARMReader. It ran fine the first time; then I tried again, this time with the offline model (same base model, RoBERTa). After that it always gives the following:
05/19/2020 14:16:31 - INFO - elasticsearch - PUT http://localhost:9200/document [status:400 request:0.004s]
05/19/2020 14:16:31 - INFO - haystack.indexing.io - Found data stored in data/article_txt_got. Delete this first if you really want to fetch new data.
05/19/2020 14:16:31 - INFO - elasticsearch - POST http://localhost:9200/_count [status:200 request:0.536s]
05/19/2020 14:16:52 - INFO - elasticsearch - POST http://localhost:9200/_bulk [status:200 request:1.665s]
05/19/2020 14:16:53 - INFO - elasticsearch - POST http://localhost:9200/_bulk [status:200 request:0.399s]
05/19/2020 14:16:53 - INFO - haystack.indexing.io - Wrote 517 docs to DB
05/19/2020 14:16:53 - INFO - farm.utils - device: cuda n_gpu: 1, distributed training: False, automatic mixed precision training: None
05/19/2020 14:17:04 - WARNING - farm.modeling.language_model - Could not automatically detect from language model name what language it is.
We guess it's an ENGLISH model ...
If not: Init the language model by supplying the 'language' param.
Traceback (most recent call last):
File "Tutorial1_Basic_QA_Pipeline.py", line 123, in
I tried with the TransformersReader as well; it gives the following:
05/19/2020 14:22:30 - INFO - elasticsearch - PUT http://localhost:9200/document [status:400 request:0.036s]
05/19/2020 14:22:30 - INFO - haystack.indexing.io - Found data stored in data/article_txt_got. Delete this first if you really want to fetch new data.
05/19/2020 14:22:30 - INFO - elasticsearch - POST http://localhost:9200/_count [status:200 request:0.004s]
05/19/2020 14:22:30 - INFO - haystack.indexing.io - Skip writing documents since DB already contains 517 docs ... (Disable only_empty_db, if you want to add docs anyway.)
05/19/2020 14:22:38 - INFO - elasticsearch - POST http://localhost:9200/document/_search [status:200 request:0.318s]
05/19/2020 14:22:38 - INFO - haystack.retriever.elasticsearch - Got 10 candidates from retriever
05/19/2020 14:22:38 - INFO - haystack.finder - Reader is looking for detailed answer in 362347 chars ...
convert squad examples to features: 100%|██████████| 1/1 [00:00<00:00, 7.14it/s]
add example index and unique id: 100%|██████████| 1/1 [00:00<00:00, 5629.94it/s]
Traceback (most recent call last):
File "Tutorial1_Basic_QA_Pipeline.py", line 140, in
Note: I did not download RoBERTa separately; I just renamed the files from the cache it downloaded automatically. I am sure I renamed them properly. Hopefully this is not affecting it.
Hi @thaisnang, by default the models are cached and are not re-downloaded on every execution. If that doesn't fit your workflow, I am curious to know more about how you plan to use the save (offline) functionality.
Here's how you can save the model of a FARMReader:
reader.inferencer.save("path-to-save")
and load it again by supplying the path:
reader = FARMReader(model_name_or_path="path-to-save")
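Putting both steps together — a minimal sketch, assuming Tutorial 1's deepset/roberta-base-squad2 model, the haystack.reader.farm import path, and a hypothetical path-to-save directory; adjust to your setup:

from haystack.reader.farm import FARMReader

# First run: downloads the model into the local cache
reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2")

# Persist the underlying FARM inferencer to disk for offline use
reader.inferencer.save("path-to-save")

# Later / offline runs: point the reader at the saved directory
reader = FARMReader(model_name_or_path="path-to-save")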
Actually, I saw the model downloading again when I ran it the second time. So I thought, instead of downloading on every execution, why not just copy the cached model, rename the files properly, and use it as an offline model? That's what I did; it should not interfere with the functionality, right?
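(For reference, a safer route than renaming cached files is to save the model explicitly with the transformers library — a minimal sketch, assuming the tutorial's deepset/roberta-base-squad2 model and a hypothetical local-roberta target directory:

from transformers import AutoModelForQuestionAnswering, AutoTokenizer

model_name = "deepset/roberta-base-squad2"
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# save_pretrained writes the weights plus config.json and the tokenizer files
model.save_pretrained("local-roberta")
tokenizer.save_pretrained("local-roberta")

This way the files keep the names the library expects, with no manual renaming.)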
OK, I ran it again, and this time the model did not re-download; it used the cached model. The model was saved as well. Thanks.
For some reason, the config file is not getting dumped into the folder. I have tried changing the folder permissions, but that did not help.
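To narrow that down, it may help to list what inferencer.save() actually wrote — a minimal sketch, assuming the path-to-save directory from above:

import os

# Check which files were written; a missing config file shows up here
print(os.listdir("path-to-save"))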