pommedeterresautee / fastrtext

R wrapper for fastText
https://pommedeterresautee.github.io/fastrtext/

Add autotune option #38

Open alanault opened 4 years ago

alanault commented 4 years ago

Hi there,

I saw on the fastText site that they've added an autotune feature, which automatically optimizes the various hyperparameters.

Seems it can be activated with the -autotune-validation option, which isn't currently supported. Wondered if this could be added with the updates for CRAN?

https://fasttext.cc/docs/en/autotune.html

best

Alan

pommedeterresautee commented 4 years ago

Hi,

Will work on it, since I need to fix something else to get back on CRAN anyway... I hope to have some time for it next week.

alanault commented 4 years ago

Sounds great - intrigued to see the results of this output

alanault commented 4 years ago

Hi there, just wondered how this was coming on?

I'm not familiar with Rcpp, but happy to help on any R-based areas if that's useful?

pommedeterresautee commented 4 years ago

Hello, I have updated the C++ code. It should work from the command line. Do you think it would make sense to have an R API for this?

alanault commented 4 years ago

Hi: that's fantastic!

Yes, I was thinking it could just be exactly the same as the other calls, just with the -autotune-validation passed as an option in the execute command, along with the validation file.

So, to use the supervised example: execute(commands = c("supervised", "-input", train_tmp_file_txt, "-output", tmp_file_model, "-autotune-validation", valid_tmp_file_text))

That way everything is consistent within the execute command?
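
A minimal sketch of that proposal, for illustration only. The helper build_autotune_command and the file names train.txt, valid.txt and model are hypothetical and not part of fastrtext; -autotune-validation and -autotune-duration are flags from the fastText autotune documentation, and whether execute() passes them through cleanly is exactly what this thread is trying to establish.

```r
# Hypothetical helper (not part of fastrtext): builds the argument vector
# for a supervised run with autotune enabled.
build_autotune_command <- function(train_file, valid_file, model_prefix,
                                   duration_sec = NULL) {
  cmd <- c("supervised",
           "-input", train_file,
           "-output", model_prefix,
           "-autotune-validation", valid_file)
  if (!is.null(duration_sec)) {
    # Optional fastText flag: time budget for the search, in seconds.
    cmd <- c(cmd, "-autotune-duration", as.character(duration_sec))
  }
  cmd
}

# Placeholder file names; replace with real training/validation files.
cmd <- build_autotune_command("train.txt", "valid.txt", "model",
                              duration_sec = 60)
print(cmd)

# Only attempt training when fastrtext is installed and the files exist.
if (requireNamespace("fastrtext", quietly = TRUE) &&
    all(file.exists("train.txt", "valid.txt"))) {
  fastrtext::execute(commands = cmd)
}
```

Keeping the argument vector in one place makes it easy to run the same call with and without the autotune flags when comparing behaviour.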

pommedeterresautee commented 4 years ago

I think so. Not yet tried. If you check, can you let me know if it works?

alanault commented 4 years ago

Installed 0.3.4. The call itself seems to be accepted just fine... however, each time I try it, R crashes with a "C stack usage is too close to the limit" error.

I'm only training on 100k short sentences and it crashes within seconds (at about 0.8% completion), so I wonder if something else is going on?

pommedeterresautee commented 4 years ago

Does it crash in other situations?

alanault commented 4 years ago

No - if I remove the -autotune-validation argument and make the same training call, it runs just fine on the same data, so it must be related to the validation...

alanault commented 4 years ago

Did a test using the command-line version and this worked fine, so the issue must be somewhere inside the Rcpp wrapper.
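
Since the command-line run succeeds, one possible stopgap is to run autotune through the fastText binary from R and then load the resulting model with fastrtext. This is only a sketch under assumptions: a binary named fasttext is on the PATH, the file names are placeholders, and the helper fasttext_cli_args is hypothetical. It has not been checked against the crash described above.

```r
# Hypothetical helper: assembles arguments for the fastText command line.
fasttext_cli_args <- function(train_file, valid_file, model_prefix) {
  c("supervised",
    "-input", train_file,
    "-output", model_prefix,
    "-autotune-validation", valid_file)
}

# Assumption: a fastText binary named "fasttext" is on the PATH.
fasttext_bin <- unname(Sys.which("fasttext"))
args <- fasttext_cli_args("train.txt", "valid.txt", "model")

if (nzchar(fasttext_bin) && all(file.exists("train.txt", "valid.txt"))) {
  # Run autotune outside the R process, so the Rcpp wrapper is not involved.
  status <- system2(fasttext_bin, args = shQuote(args))
  if (status == 0 && requireNamespace("fastrtext", quietly = TRUE)) {
    # fastText writes <prefix>.bin for supervised models.
    model <- fastrtext::load_model("model.bin")
  }
} else {
  message("fastText binary or data files not found; skipping training.")
}
```

The trade-off is an extra dependency on a separately built fastText binary, but training and prediction stay decoupled while the wrapper issue is investigated.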

alanault commented 4 years ago

Any luck with this?