[Doc] Advanced usage example for Sequence Labeling in NLP

automl / SMAC3

SMAC3: A Versatile Bayesian Optimization Package for Hyperparameter Optimization

https://automl.github.io/SMAC3/v2.1.0/

[Doc] Advanced usage example for Sequence Labeling in NLP #851

Closed jd-coderepos closed 1 year ago

jd-coderepos commented 2 years ago

Dear AutoML developers,

I wanted to point out that there is a really great NLP tool, NCRF++ (https://github.com/jiesutd/NCRFpp), which provides just a config file as the interface for setting up various sequence labeling architectures. In my view, it might be of great interest to NLP practitioners and users of NCRF++ to see a clear usage example of how AutoML could make hyperparameter tuning less cumbersome once a user has decided on a specific neural network architecture in NCRF++.

@jiesutd

Cheers, Jennifer

eddiebergman commented 2 years ago

Hi @jd-coderepos,

Thanks for pointing us to that and for showing your interest in using SMAC for NLP hyperparameter tuning! Two follow-up questions:

1. Are you hoping to just have a config + CLI based interface for using SMAC?
2. Is it specifically NLP and sequence labelling you're looking for an example of?

I ask the second one because it's very hyper-specific, but if you could elaborate on your desired workflow, then maybe we can turn this into a more tangible goal that makes it more intuitive :)

Best, Eddie

jd-coderepos commented 2 years ago

Hello @eddiebergman,

> elaborate on your desired work-flow then maybe turn this into a more tangible goal that makes it more intuitive :)

Yes, I am happy to try to do this.

I use the demo.train.config file to specify the sequence labeling architecture:

###NetworkConfiguration###
use_crf=True
use_char=True
word_seq_feature=LSTM
char_seq_feature=CNN

lstm_layer=1
bilstm=True

For the architecture above, i.e. a BiLSTM-CNN-CRF sequence labeler, would it be possible to illustrate a working example of how AutoML could best be leveraged to obtain well-tuned values for the following hyperparameters?

cnn_layer=
char_hidden_dim=
hidden_dim=
dropout=
learning_rate=
lr_decay=
momentum=
l2=

Cheers!

PS: I am happy to clarify further as needed.

alexandertornede commented 1 year ago

Hi,

We just looked into this again, and it seems that this is a very specific use case. We would be glad to include an example if you provide one via a pull request.

However, for now, we will close the issue. If you plan on opening a pull request and have any questions, feel free to reopen the issue.
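
A minimal sketch of what such an example might look like, assuming SMAC3 v2 (`Scenario`, `HyperparameterOptimizationFacade`) and the dict form of `ConfigurationSpace`. The hyperparameter names and the architecture lines are taken from the config discussed above; everything else is an assumption for illustration: the value ranges in `SEARCH_SPACE`, the helper names `render_config`, `make_target` and `run_smac`, the `smac_runs` directory, and above all the `evaluate` callable, which is a hypothetical user-supplied function that trains NCRF++ on a given config file and returns a dev-set F1 score in [0, 1]. The base config here also omits the I/O settings a real NCRF++ run needs.

```python
import itertools
from pathlib import Path

# Architecture lines copied from the issue; tuned keys are appended per trial.
BASE_CONFIG = """\
###NetworkConfiguration###
use_crf=True
use_char=True
word_seq_feature=LSTM
char_seq_feature=CNN
lstm_layer=1
bilstm=True
"""

# Illustrative ranges only (assumptions, not recommended values).
# Integer tuples become integer ranges, float tuples become float ranges.
SEARCH_SPACE = {
    "cnn_layer": (1, 4),
    "char_hidden_dim": (25, 100),
    "hidden_dim": (100, 400),
    "dropout": (0.1, 0.6),
    "learning_rate": (0.001, 0.1),
    "lr_decay": (0.0, 0.1),
    "momentum": (0.0, 0.9),
    "l2": (1e-9, 1e-6),
}


def render_config(base: str, values: dict) -> str:
    """Return config text with one `key=value` line per tuned hyperparameter."""
    # Drop any base line that sets a key we are about to override.
    lines = [ln for ln in base.splitlines() if ln.split("=")[0] not in values]
    lines += [f"{key}={value}" for key, value in values.items()]
    return "\n".join(lines) + "\n"


def make_target(evaluate, workdir="smac_runs"):
    """Wrap a hypothetical `evaluate(config_path) -> F1` as a SMAC target."""
    trial_ids = itertools.count()

    def target(config, seed: int = 0) -> float:
        run_dir = Path(workdir)
        run_dir.mkdir(parents=True, exist_ok=True)
        path = run_dir / f"trial_{next(trial_ids)}.config"
        path.write_text(render_config(BASE_CONFIG, dict(config)))
        # SMAC minimizes the returned cost, so turn the F1 score into a loss.
        return 1.0 - evaluate(path)

    return target


def run_smac(evaluate, n_trials=50):
    """Run the optimization; imports are local so the helpers work without SMAC."""
    from ConfigSpace import ConfigurationSpace
    from smac import HyperparameterOptimizationFacade, Scenario

    space = ConfigurationSpace(space=SEARCH_SPACE)
    scenario = Scenario(space, deterministic=True, n_trials=n_trials)
    smac = HyperparameterOptimizationFacade(scenario, make_target(evaluate))
    return smac.optimize()  # the best configuration found (the incumbent)
```

Calling `run_smac(evaluate)` with your own `evaluate` function would then return the best configuration SMAC found within the trial budget. The uniform ranges keep the sketch short; for `learning_rate` and `l2`, a log-scaled range would probably be a better fit in practice.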