A major architectural difference between current flex (morning 2022-04-14) and IMN (latter of which gives very high score of .63/.65 on absa) is the split cnn for the first layer of shared components.
Recalling that IMN architectures w/ more than a single shared layer gave much higher results, it can be hypothesized that split_cnns have some affect on model performance.
Therefore, it was deemed necessary to update flex model to also handle split cnns.
When first building for this capability, consideration of how many shared layers should be able to split their cnns.
Addition, task-wise splits may also be desirable, and could be double checked as well.
Due to time restraints, the first implementation should focus only on mimicking IMN, i.e. only first layer of shared gets split.
Split cnn
A major architectural difference between current flex (morning 2022-04-14) and IMN (latter of which gives very high score of .63/.65 on absa) is the split cnn for the first layer of shared components.
Recalling that IMN architectures w/ more than a single shared layer gave much higher results, it can be hypothesized that split_cnns have some affect on model performance. Therefore, it was deemed necessary to update flex model to also handle split cnns.
When first building for this capability, consideration of how many shared layers should be able to split their cnns. Addition, task-wise splits may also be desirable, and could be double checked as well. Due to time restraints, the first implementation should focus only on mimicking IMN, i.e. only first layer of shared gets split.