smartschat / cort

A toolkit for coreference resolution and error analysis.
MIT License
129 stars 34 forks

Retraining models #11

Open chaitjo opened 8 years ago

chaitjo commented 8 years ago

Is it possible to retrain models (for example, the ones from https://github.com/smartschat/cort/blob/master/COREFERENCE.md#model-downloads) with new data?

I tried training using:

cort-train -in new_retraining_data.conll \
           -out pretrained_model.obj \
           -extractor cort.coreference.approaches.mention_ranking.extract_substructures \
           -perceptron cort.coreference.approaches.mention_ranking.RankingPerceptron \
           -cost_function cort.coreference.cost_functions.cost_based_on_consistency \
           -n_iter 5 \
           -cost_scaling 100 \
           -random_seed 23

but I think it overwrites the model.

smartschat commented 8 years ago

If I understand you correctly, you want to take a model as input and then initialize training on new data with that model. Is that right?

chaitjo commented 8 years ago

Yes. For example, take http://smartschat.de/downloads/model-pair-train.obj and continue training on it using some new data.
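For reference, loading a downloaded model to inspect it before resuming training might look like the sketch below. This assumes the .obj files are Python pickles (an assumption; the exact contents depend on the perceptron class):

```python
import pickle

def load_model(path):
    """Load a serialized cort model from disk.

    Assumption: the .obj model files are Python pickles; the layout of the
    loaded object (priors/weights) depends on how the model was saved.
    """
    with open(path, "rb") as f:
        return pickle.load(f)

# e.g. model = load_model("model-pair-train.obj")
```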

smartschat commented 8 years ago

That's not implemented, but the code can be adapted. Unfortunately, I will not be able to have a closer look at this during this week. If you want to do it by yourself, I can give you some pointers.

chaitjo commented 8 years ago

Please do!

I'll create a pull request if I am able to successfully implement this.

smartschat commented 8 years ago

The constructor in perceptrons.pyx takes priors and weights parameters. However, these are overwritten in the fit method during training.

You need to make this overwriting optional, for example by adding a boolean parameter that controls whether the weights and priors are reinitialized. Then you also need to adapt the training/predicting scripts built on the experiments.py API to expose the new parameter.
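A minimal sketch of that change, using a simplified stand-in for the perceptron class (the `warm_start` parameter name is hypothetical, not part of cort's API):

```python
import numpy as np

class Perceptron:
    """Simplified stand-in for the perceptron in perceptrons.pyx."""

    def __init__(self, n_features, priors=None, weights=None):
        # priors/weights can come from a previously trained model
        self.n_features = n_features
        self.priors = priors
        self.weights = weights

    def fit(self, substructures, warm_start=False):
        # warm_start is a hypothetical flag: when True, keep the priors and
        # weights handed to the constructor instead of reinitializing them.
        if not warm_start or self.weights is None:
            self.priors = {}
            self.weights = np.zeros(self.n_features)
        # ... perceptron updates on `substructures` would follow here ...
        return self
```

With this, `Perceptron(n, weights=old_weights).fit(data, warm_start=True)` continues from the loaded weights, while the default behavior stays unchanged.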

I hope this helps. If you have any questions, I'm happy to answer them!

rakesh-malviya commented 6 years ago

Hi Sebastian,

Can you give a rough estimate of how long it took you to train cort on the CoNLL data? What hardware did you use?

Thanks and regards, Rakesh Malviya

smartschat commented 6 years ago

Hi Rakesh,

Training the ranking model takes around two minutes per epoch. Preprocessing takes ~20 minutes, if I remember correctly. Due to high memory requirements, I train the models on a server with >100GB RAM, using ~20 2.3GHz CPUs. However, only preprocessing is parallelized.