Is your feature request related to a problem? Please describe.
Our annotations usually contain a lot of text that is actually categorical (e.g. speaker type).
Storing these columns with the pandas Categorical data type might reduce memory usage and save some CPU time.
There are, however, a number of pitfalls, so we should be very careful with this.
This could perhaps be offered as an option rather than applied unconditionally.
Describe the solution you'd like
Assess the impact of Categorical on performance (memory usage and CPU time) using realistic data.
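As a starting point, here is a minimal sketch of how the memory side of that assessment could be measured. It uses synthetic data only: the `compare_memory` helper and the label values are made up for illustration, and real annotation tables may behave quite differently (for example with many distinct values or long free-text fields).

```python
import numpy as np
import pandas as pd


def compare_memory(values):
    """Return (object_bytes, category_bytes) for the same column of labels.

    Hypothetical helper for this issue, not part of any existing API.
    """
    as_object = pd.Series(values, dtype="object")
    as_category = as_object.astype("category")
    # deep=True also counts the Python string objects behind the column.
    return (
        int(as_object.memory_usage(deep=True)),
        int(as_category.memory_usage(deep=True)),
    )


# Synthetic stand-in for a "speaker type" column; the labels are invented.
rng = np.random.default_rng(0)
speaker_types = rng.choice(["adult", "child", "machine"], size=100_000)

object_bytes, category_bytes = compare_memory(speaker_types)
print(f"object:   {object_bytes} bytes")
print(f"category: {category_bytes} bytes")
```

The same comparison should be repeated on real annotation files, and CPU time for typical operations (filtering, grouping, merging) could be measured in the same way with `timeit` before deciding anything.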