Closed hantingge closed 4 years ago
Thanks for your interest in our work.
Q1 and Q4: Thanks for your suggestions. During parameter tuning we wrote some code and changed some parameters, and did not change them back in this final version. From my perspective they do not affect the results or execution in any visible way, but we will incorporate your suggestions in the next version.
Q2: It is a sentence-level shuffle.
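To make "sentence-level shuffle" concrete, here is a minimal sketch (plain Python, illustrative toy data — not code from the repo) contrasting shuffling at the sentence level with shuffling at the document level:

```python
import random

# Toy corpus: each document is a list of sentences (illustrative data only).
docs = [["d1-s1", "d1-s2"], ["d2-s1", "d2-s2", "d2-s3"]]

# Sentence-level shuffle: flatten to individual sentences, then permute them,
# so sentences from different documents can be interleaved in any order.
sentences = [s for doc in docs for s in doc]
random.shuffle(sentences)

# Document-level shuffle (for contrast): permute whole documents while
# keeping the sentence order inside each document intact.
doc_order = docs[:]
random.shuffle(doc_order)
```

With `shuffle=True` on a sentence-level dataloader, each training epoch sees the sentences in a fresh random order, as in the first case above.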
Q3: This follows common practice in some previous studies. Yes, we have experimented with training on just train.tsv, and the results are lower.
A few other questions:
1. For the batch sizes of the dataloaders, why do you use `10` for training and `50` for evaluating the devel and test sets?
2. In your train_wc.py, you set `shuffle=True` for the `dataset_loader` of each dataset. When I shuffle each dataset's dataloader, does it shuffle only at the document level, or also the sentences within each document?
3. Why did you merge the `train` and `devel` biomedical datasets for training? Doesn't the model overfit? I assume you have experimented with training on just train.tsv, and the F1s on the test set are lower than with merge.tsv?
4. On lines 246 and 251 in train_wc.py, what is the point of `sample_num` and the for loop with `range(1)`, if it is always 1?
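On the batch-size question, a common pattern (an illustrative sketch, not the repo's actual code) is a small batch for training, where gradients and optimizer state are held per batch, and a larger batch for evaluation, where no gradients are stored so more examples fit in memory at once:

```python
def batches(data, batch_size):
    """Yield consecutive slices of `data` of length `batch_size`; the last may be shorter."""
    for i in range(0, len(data), batch_size):
        yield data[i:i + batch_size]

data = list(range(100))                      # stand-in for 100 sentences
train_batches = list(batches(data, 10))      # smaller batches during training
eval_batches = list(batches(data, 50))       # larger batches during evaluation
```

As for `range(1)`: a loop over `range(1)` executes its body exactly once, so it behaves the same as no loop at all; per the reply above, it is a leftover from parameter tuning.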