It's important for these disaggregators to be implemented following best practices from existing literature. Note that some of these may not lend themselves well to text data, in which case they might be implemented for image data instead (#6). There may also be some cases where both an image and a text version of a particular characteristic would be good to have.
Would be neat to have options for rule-based disaggregators (e.g. word presence for pronouns) and ML-powered ones (e.g. using NLI to find pronouns). Maybe for overlapping ones they could exist as configurations within the same disaggregator module? It's important though that any ML-powered ones use optional extra dependencies.
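To make the "configurations within the same module" idea concrete, here is a rough sketch of what it could look like. Everything in it is an assumption for illustration only: the `PronounDisaggregator` class, the `method` argument, the word lists, and the 0.5 score threshold are all hypothetical, and the ML path assumes the Hugging Face `transformers` zero-shot classification pipeline as the optional extra.

```python
import re
from typing import Callable, Dict, Set

# Hypothetical word lists; real ones should follow the existing literature.
PRONOUNS: Dict[str, Set[str]] = {
    "she_her": {"she", "her", "hers", "herself"},
    "he_him": {"he", "him", "his", "himself"},
    "they_them": {"they", "them", "their", "theirs", "themselves"},
}


def rule_based(text: str) -> Dict[str, bool]:
    """Flag each pronoun group by simple word presence."""
    tokens = set(re.findall(r"[a-z']+", text.lower()))
    return {group: bool(tokens & words) for group, words in PRONOUNS.items()}


def load_nli_backend() -> Callable[[str], Dict[str, bool]]:
    """Build the ML-powered variant, importing its dependency lazily."""
    try:
        # Optional extra dependency: only needed for method="nli".
        from transformers import pipeline
    except ImportError as err:
        raise ImportError(
            "method='nli' needs the optional ML dependencies to be installed."
        ) from err

    classifier = pipeline("zero-shot-classification")
    labels = list(PRONOUNS)

    def nli_based(text: str) -> Dict[str, bool]:
        out = classifier(text, candidate_labels=labels, multi_label=True)
        scores = dict(zip(out["labels"], out["scores"]))
        # 0.5 is an arbitrary placeholder threshold, not a tuned value.
        return {label: scores[label] > 0.5 for label in labels}

    return nli_based


class PronounDisaggregator:
    """One module, two configurations: 'rule' (default) or 'nli'."""

    def __init__(self, method: str = "rule") -> None:
        if method == "rule":
            self._fn = rule_based
        elif method == "nli":
            self._fn = load_nli_backend()
        else:
            raise ValueError(f"Unknown method: {method!r}")

    def __call__(self, text: str) -> Dict[str, bool]:
        return self._fn(text)
```

With this shape, `PronounDisaggregator()` works on a base install, while `PronounDisaggregator(method="nli")` only touches the ML dependency when someone asks for it, and fails with a clear message if the extra is missing.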
Characteristics to consider include:
- Ability Status
- Ethnicity