Saber is a deep-learning-based tool for information extraction in the biomedical domain. Pull requests are welcome! Note: this is a work in progress. Many things are broken, and the codebase is not stable.
This pull request addresses a couple of bugs that prevented transfer learning from working correctly, specifically:
When transferring, you need to map every word and character in the target dataset that doesn't appear in the source dataset to the unknown token. The alternative is to modify the neural network's architecture after it has been trained, which I do not know how to do.
When mapping types to unique integer IDs, the mapping was not necessarily a consecutive series of integers starting at 0. This caused issues when the targets were one-hot encoded. Fixed.
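For reviewers, here is a minimal sketch of the two fixes described above. It is not the code in this pull request: the names `UNK`, `build_type_to_idx`, `encode`, and `one_hot` are hypothetical and chosen only for illustration, and the example sentences are made up.

```python
# Illustrative sketch only; all names here are hypothetical, not Saber's API.
UNK = "<UNK>"  # assumed spelling of the unknown token


def build_type_to_idx(types, unk=UNK):
    """Map each unique type to a consecutive integer ID starting at 0."""
    # Reserve ID 0 for the unknown token, then enumerate the remaining
    # types in sorted order so the IDs are gap-free and deterministic.
    ordered = [unk] + sorted(set(types) - {unk})
    return {type_: idx for idx, type_ in enumerate(ordered)}


def encode(tokens, type_to_idx, unk=UNK):
    """Convert tokens to IDs, sending anything unseen to the unknown ID."""
    unk_id = type_to_idx[unk]
    return [type_to_idx.get(token, unk_id) for token in tokens]


def one_hot(ids, num_classes):
    """One-hot encode IDs; this only works if every ID < num_classes."""
    rows = []
    for idx in ids:
        row = [0] * num_classes
        row[idx] = 1
        rows.append(row)
    return rows


# Vocabulary built from the (made-up) source dataset.
source_vocab = build_type_to_idx(["aspirin", "inhibits", "COX-1"])

# The target dataset contains "ibuprofen", which the source never saw.
target_tokens = ["ibuprofen", "inhibits", "COX-1"]
encoded = encode(target_tokens, source_vocab)
encoded_one_hot = one_hot(encoded, len(source_vocab))
```

Because the IDs run from 0 to `len(source_vocab) - 1` with no gaps, the one-hot width can simply be the vocabulary size, and the trained network's input dimensions never need to change for the target dataset.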