kermitt2 / delft

a Deep Learning Framework for Text https://delft.readthedocs.io/
Apache License 2.0

Fix --output option to dump the model in a specific directory #124

Open lfoppiano opened 2 years ago

lfoppiano commented 2 years ago

I implemented this some time ago and it worked relatively well for normal models; however, for transformers we realised it was not taken into account. With the new update to TF2 it seems to work fine for sequenceLabelling, except for the n-fold cross-validation.

In particular, the problem seems to be related to the fact that saving the model is hidden inside the cross-validation. In my opinion we should call model.save() after the n-fold cross-validation, which would save either just one model (the best) or all of them (e.g. in the case of an ensemble).

My proposal is to give the wrapper a working/temporary directory and then explicitly save the model by calling model.save(), passing either the --output path or the default path under data/models.
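Roughly, something like this (a sketch only: the model name, the data variables and the argument handling are illustrative, not the actual grobidTagger.py code, and parameter names such as fold_number may differ):

```python
# Sketch of the proposed flow: train with n-fold cross-validation first,
# then save explicitly, outside of the cross-validation code.
from delft.sequenceLabelling import Sequence

model = Sequence('citation', fold_number=10)

# today the best model is also written to disk from inside this call,
# which is the part that would move out
model.train_nfold(x_train, y_train)

# explicit save after the cross-validation: either the path given with
# --output, or the default location under data/models/
output_directory = args.output   # None would fall back to the default path
model.save(dir_path=output_directory)
```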

lfoppiano commented 2 years ago

My suggestion would be the following:

With this approach the wrapper would only need to know the temporary directory, and if something fails, delft will not pollute the data/ directory.
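For illustration, with a standard temporary directory (how the working directory would be handed to the wrapper is hypothetical, it is not an existing delft parameter):

```python
# Illustration of the tmp-directory idea: all intermediate artifacts would
# live under a temporary working directory, and only an explicit, final
# save() writes to data/models/ or to the --output path.
import tempfile

from delft.sequenceLabelling import Sequence

with tempfile.TemporaryDirectory(prefix="delft-") as working_directory:
    # hypothetically, the wrapper would keep the per-fold checkpoints and
    # transformer files under working_directory
    model = Sequence('citation', fold_number=10)
    model.train_nfold(x_train, y_train)

    # a single explicit save at the end; if anything above fails, the
    # temporary directory is removed and data/models/ stays untouched
    model.save(dir_path=args.output)
```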

What do you think?

kermitt2 commented 2 years ago

To clarify: as far as I know, there is currently no problem with the --output option.

The dir_path parameter in save() for the two wrappers works fine in n-fold usage, both with and without a transformer, in the previous version and in the current one. The --output parameter in grobidTagger.py works too - it is the only "applications" script using the --output parameter (it can be used to save a Grobid model directly under the Grobid home, with the right compatible Grobid model name).
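For reference, a minimal example of the dir_path usage mentioned above (the model names and target paths are only examples, and this only summarises how the --output value is used, it is not the grobidTagger.py code):

```python
# Saving with an explicit dir_path on the two wrappers; model names and
# paths are examples only.
from delft.sequenceLabelling import Sequence
from delft.textClassification import Classifier

seq_model = Sequence('citation')
# ... train or load the model ...
seq_model.save(dir_path='/path/to/grobid-home/models/')

clf_model = Classifier('my_classifier')
# ... train or load the model ...
clf_model.save(dir_path='/tmp/my_classifier/')
```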

Using the tmp folder will modify the saving mechanism to address #126, but the --output option is working without it in my tests.

lfoppiano commented 2 years ago

Maybe the current version is good now.

When I opened this issue it was version 0.2.6, I think, and at that time it was not working only in the case of scibert + 10-fold cross-validation or training (I can't remember exactly): only the config and the preprocessor were saved correctly in the output directory, while the rest was saved in the default directory data/models/...

Since there is a new version, we can close it from my side 🎉