AinaIanemahy closed this issue 7 months ago
Also see previous issue #36
Could we run the integration tests with different batch sizes to see whether the batch size has an impact on performance?
@Garrafao Is running this integration test on the testwug dataset enough, or should we run it on all datasets?
Mhhh… If we want a realistic test, I would say on one of the larger data sets too, maybe DWUG DE?
Here are the results with various batch sizes:
| Annotator | Data | accuracy | correlation | p-value | batch_size |
| --- | --- | --- | --- | --- | --- |
| XL-Lexeme-Binary | dwug_de | 0.778 | 0.516 | 0.0 | 8 |
| XL-Lexeme-Binary | dwug_de | 0.778 | 0.516 | 0.0 | 16 |
| XL-Lexeme-Binary | dwug_de | 0.778 | 0.516 | 0.0 | 32 |
| XL-Lexeme-Binary | dwug_de | 0.778 | 0.516 | 0.0 | 64 |
| XL-Lexeme-Binary | dwug_de | 0.778 | 0.516 | 0.0 | 128 |
| XL-Lexeme-Binary | dwug_de | 0.778 | 0.516 | 0.0 | 256 |
| XL-Lexeme-Binary | dwug_de | 0.778 | 0.516 | 0.0 | 512 |
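Accuracy and correlation are identical for every batch size from 8 to 512 on dwug_de, so the batch size has no measurable effect on performance here. Hence this issue is being closed.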
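For anyone who wants to repeat the check on another dataset, here is a minimal sketch of such a sweep. It is not the project's actual test code: `sweep_batch_sizes`, `metrics_vary`, and `reported_dwug_de` are hypothetical names, and the stub evaluator just returns the numbers reported in this thread. In practice you would replace it with a function that runs the integration test at the given batch size and returns (accuracy, correlation).

```python
from typing import Callable, Dict, Iterable, Tuple

Metrics = Tuple[float, float]  # (accuracy, correlation)


def sweep_batch_sizes(
    evaluate: Callable[[int], Metrics],
    batch_sizes: Iterable[int],
) -> Dict[int, Metrics]:
    """Run the evaluation once per batch size and collect the metrics."""
    return {bs: evaluate(bs) for bs in batch_sizes}


def metrics_vary(results: Dict[int, Metrics], tol: float = 1e-3) -> bool:
    """Return True if accuracy or correlation differs by more than tol."""
    accuracies = [m[0] for m in results.values()]
    correlations = [m[1] for m in results.values()]
    return (
        max(accuracies) - min(accuracies) > tol
        or max(correlations) - min(correlations) > tol
    )


def reported_dwug_de(batch_size: int) -> Metrics:
    """Hypothetical stand-in: returns the values reported in this thread.

    Replace with a real call that runs the annotator on the dataset
    using the given batch size.
    """
    return (0.778, 0.516)


results = sweep_batch_sizes(reported_dwug_de, [8, 16, 32, 64, 128, 256, 512])
print(metrics_vary(results))  # prints False
```

The tolerance `tol` is an arbitrary choice for this sketch. Small differences from floating-point effects at different batch sizes would fall below it, while a real change in the metrics would not.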
We should decide whether we want to optimize the batch size parameter. The default value given by pierluigi is 32.