Open j6mes opened 5 years ago
@j6mes Have you ever figured out how to retrain the system? I'm trying to get it working on Czech wiki, but it's very unclear how to move forward.
No, I never needed to go through and retrain the entire system. For me, I got the best value out of just putting it into a Docker image and calling it as a black box.
Perhaps @easonnie could advise on how to retrain the system?
@j6mes Thanks for the reply. I have played around with your fork quite a lot, so thanks for the cleaned-up version. Hopefully @easonnie finds the time to advise on retraining.
@MichalPitr I had previously experimented with training their sentence retrieval and verification models. I do not have a compact version of the training code at the moment. I will just give you some quick steps and I think it is somewhat easy to figure out the rest.
1. `default_steps` variable. Steps to be executed are from `s1.tokenizing` to `s2.2.1doc_nn_retri`. (Use the output files for the rest of the training steps.)
2. `train_fever_v1` from the file `src/sentence_retrieval/simple_nnmodel.py`
3. `train_fever_v1_advsample` from the file `src/nli/mesim_wn_simi_v1_2.py`
Let me know if you get stuck somewhere!
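For anyone following along, the step range mentioned above could be selected roughly like this. This is only a sketch under assumptions: it treats the pipeline steps as an ordered list of names, and the helper `select_steps`, the list `ALL_STEPS`, and the placeholder step names are hypothetical, not taken from the repository. Only `s1.tokenizing`, `s2.2.1doc_nn_retri`, and the variable name `default_steps` come from the thread; the real structure of `default_steps` may differ.

```python
from typing import List


def select_steps(all_steps: List[str], first: str, last: str) -> List[str]:
    """Return the contiguous run of steps from `first` to `last`, inclusive."""
    start = all_steps.index(first)  # raises ValueError if the name is unknown
    end = all_steps.index(last)
    if end < start:
        raise ValueError(f"{last!r} comes before {first!r} in the step list")
    return all_steps[start:end + 1]


# Hypothetical step list. Only the two "s..." names appear in the thread;
# the other entries are placeholders for whatever the real pipeline defines.
ALL_STEPS = [
    "s1.tokenizing",
    "hypothetical.intermediate_step",
    "s2.2.1doc_nn_retri",
    "hypothetical.later_step",
]

# Assumption: default_steps holds the names of the steps to execute.
default_steps = select_steps(ALL_STEPS, "s1.tokenizing", "s2.2.1doc_nn_retri")
print(default_steps)
```

Slicing by name rather than by position keeps the selection valid if steps are added before or after the range.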
@ShyamSubramanian Thanks, that's really useful. I am especially interested in getting the document retrieval working on my Czech wiki database, but `auto_pipeline.py` uses a file, `id_dict.json`, that I haven't figured out how to generate using the code.
Hi, is it possible to retrain the system with different data? What scripts do I need to run to do this? There seem to be a lot of Python files and I'm not sure which ones to call.