Closed: michael-conrad closed this issue 2 years ago
Correct: there are 42 features in our own lookup, plus 24 more that we get from a resource called PanPhon, which also encodes phonemes as vectors. For some symbols this other resource behaved strangely, so I added those checks to figure out why the PanPhon vectors sometimes have a different number of dimensions. When I rework the feature vectors for tone and lengthening, I will probably remove the PanPhon vectors entirely for simplicity.
I've started using constants for these and pass the constant to the models as the matching dim parameter.
Is that correct?
You mean that the number of dimensions in a feature vector is constant? If so, then yes. If you use different feature vectors with a different number of dimensions, you can simply replace all of those with the new expected number of dimensions.
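A minimal sketch of what is meant above, assuming a PyTorch-style model (the constant name, `DummyEncoder` class, and dimensions are hypothetical, not from the actual codebase):

```python
import torch

# Hypothetical constant: 42 own features + 24 PanPhon features = 66.
FEATURE_DIM = 66

class DummyEncoder(torch.nn.Module):
    def __init__(self, feature_dim: int = FEATURE_DIM, hidden_dim: int = 128):
        super().__init__()
        # The input projection width must match the feature-vector width,
        # so both are driven by the same constant.
        self.proj = torch.nn.Linear(feature_dim, hidden_dim)

    def forward(self, feature_vectors: torch.Tensor) -> torch.Tensor:
        # feature_vectors: (batch, seq_len, feature_dim)
        return self.proj(feature_vectors)

encoder = DummyEncoder()
out = encoder(torch.zeros(2, 10, FEATURE_DIM))
print(tuple(out.shape))
```

If the feature set changes, only `FEATURE_DIM` needs to be updated.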
I want to test, at some point, converting the transcript text directly to byte values. Would simply setting the feature entries up as {"symbol_type": "byte", "b0": 0/1, "b1": 0/1, ...}, as a direct conversion from bytes, work?
I don't see a reason why it wouldn't work, although I think there could be an easier way: skip the articulatory-vector pipeline entirely and convert the sequence of bytes directly to a sequence of LongTensors.
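A sketch of the byte-based alternative suggested above, assuming PyTorch (the function name and embedding size are illustrative, not from the repository):

```python
import torch

def text_to_byte_ids(text: str) -> torch.Tensor:
    # Encode to UTF-8 and use the raw byte values (0-255) as token IDs.
    return torch.tensor(list(text.encode("utf-8")), dtype=torch.long)

ids = text_to_byte_ids("tsalagi")

# An nn.Embedding with 256 entries then stands in for the
# articulatory feature-vector lookup.
embedding = torch.nn.Embedding(num_embeddings=256, embedding_dim=64)
vectors = embedding(ids)  # shape: (number_of_bytes, 64)
```

This avoids hand-building per-bit feature entries, since the embedding table learns its own representation per byte value.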
I was testing adding tones and lengths as features and discovered that there is a hard-coded check for 66 features; the number appears hard-coded, without explanation, in multiple locations.
Would it be safe to assume that any '66' is there only for feature-count checking?
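If that assumption holds, one way to make such checks self-documenting is to replace the magic number with a single named constant (a hypothetical refactor sketch, not code from the repository):

```python
# Hypothetical refactor: one named constant instead of a literal 66
# scattered across multiple files, so adding tone/length features
# means changing exactly one line.
NUMBER_OF_FEATURES = 66  # 42 own features + 24 PanPhon features

def check_feature_vector(vec) -> None:
    # Sanity check previously written as: assert len(vec) == 66
    assert len(vec) == NUMBER_OF_FEATURES, (
        f"expected {NUMBER_OF_FEATURES} features, got {len(vec)}"
    )

check_feature_vector([0] * 66)  # passes silently
```

A grep for the literal `66` would then confirm whether every occurrence really is a feature-count check.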