kolloldas / torchnlp

Easy to use NLP library built on PyTorch and TorchText
Apache License 2.0

Batch size stuck at 100 #10

Closed Alexmac22347 closed 5 years ago

Alexmac22347 commented 5 years ago

Hi, even when I try changing the hyperparameters like so:

from torchnlp.ner import *

h2 = hparams_transformer_ner()
h2.update(batch_size=10)

train('ner-conll2003-nocrf', TransformerTagger, conll2003, hparams=h2)

The batch.batch_size is still 100 (line 167 of train.py; I added the print statement):

for batch in prog_iter:
    print(batch.batch_size)

Edit: I can see where the batch size gets its default of 100, at line 41 of torchnlp/ner.py:


conll2003 = partial(conll2003_dataset, 'ner', hparams_tagging_base().batch_size,
                    root=PREFS.data_root,
                    train_file=PREFS.data_train,
                    validation_file=PREFS.data_validation,
                    test_file=PREFS.data_test)

However, I'm not sure where it's supposed to be updated to a custom value.
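
A possible workaround (untested; the direct import of conll2003_dataset and PREFS from torchnlp.ner is an assumption) would be to build the dataset partial yourself with the desired batch size, mirroring line 41 of ner.py, and pass that to train() instead of the module-level conll2003:

from functools import partial

from torchnlp.ner import *
# Assumption: conll2003_dataset and PREFS are reachable from torchnlp.ner,
# since ner.py itself references them when building the default partial.
from torchnlp.ner import conll2003_dataset, PREFS

h2 = hparams_transformer_ner()
h2.update(batch_size=10)

# Bind the custom batch size at construction time instead of relying on the
# module-level `conll2003` partial, which freezes the default of 100.
conll2003_bs10 = partial(conll2003_dataset, 'ner', h2.batch_size,
                         root=PREFS.data_root,
                         train_file=PREFS.data_train,
                         validation_file=PREFS.data_validation,
                         test_file=PREFS.data_test)

train('ner-conll2003-nocrf', TransformerTagger, conll2003_bs10, hparams=h2)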

kolloldas commented 5 years ago

Yeah, the problem is that batch_size is a parameter required by the data pipeline. The way to go about it would be to remove the parameter from this call and set it inside the train() call in tasks/sequence_tagging/main.py. I'll try to get an update out by the weekend, but feel free to make a PR!
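
A rough sketch of what that change might look like (this is not the actual patch; the dataset_fn name inside train() is an assumption):

# torchnlp/ner.py: drop the hard-coded batch size from the dataset partial
conll2003 = partial(conll2003_dataset, 'ner',
                    root=PREFS.data_root,
                    train_file=PREFS.data_train,
                    validation_file=PREFS.data_validation,
                    test_file=PREFS.data_test)

# tasks/sequence_tagging/main.py: inside train(), supply the batch size from
# hparams when the dataset partial is finally called. `dataset_fn` stands for
# the partial passed to train() (e.g. conll2003 above); partial() appends this
# positional argument right after 'ner', matching the original signature.
dataset = dataset_fn(hparams.batch_size)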

aleksas commented 5 years ago

Same issue here :/