Hi Bailin,
I've implemented caching for the database similarity matrix used by DBScheduler. The current version calls spaCy at every step, which is inefficient. Computing the matrix once at the start of training gives a big speedup during BERT-based training (at least from what I have observed).
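For reference, here is a minimal sketch of the caching idea, not the actual change in this PR. `SimilarityCache`, `warm_up`, and `token_overlap` are hypothetical names, and `token_overlap` is a simple stand-in for the spaCy similarity call so that the snippet runs without a model.

```python
from typing import Callable, Dict, List, Sequence

Matrix = List[List[float]]


def token_overlap(a: str, b: str) -> float:
    """Stand-in for the spaCy call (assumption): Jaccard overlap of '_'-separated tokens."""
    ta, tb = set(a.lower().split("_")), set(b.lower().split("_"))
    return len(ta & tb) / len(ta | tb)


class SimilarityCache:
    """Computes each database's similarity matrix once and reuses it afterwards."""

    def __init__(self, similarity_fn: Callable[[str, str], float]) -> None:
        self._similarity_fn = similarity_fn
        self._cache: Dict[str, Matrix] = {}

    def get(self, db_id: str, names: Sequence[str]) -> Matrix:
        # Only the first request for a database pays for the pairwise calls.
        if db_id not in self._cache:
            self._cache[db_id] = [
                [self._similarity_fn(a, b) for b in names] for a in names
            ]
        return self._cache[db_id]

    def warm_up(self, schemas: Dict[str, Sequence[str]]) -> None:
        # Call once before training so no step triggers a similarity computation.
        for db_id, names in schemas.items():
            self.get(db_id, names)


# Hypothetical schema: one database with three column names.
schemas = {"concert_singer": ["singer_id", "singer_name", "concert_id"]}
cache = SimilarityCache(token_overlap)
cache.warm_up(schemas)
matrix = cache.get("concert_singer", schemas["concert_singer"])  # served from the cache
```

The cache is keyed by database id on the assumption that a database's schema does not change during training; if it could, the key would need to include the names as well.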
This might also help with Issue #11.
Let me know if this is helpful, or if you would rather move it to the non-public version of the code.
Thanks, Tom