Closed meshy closed 8 years ago
This would be useful for people who know which encoding they are likely to encounter, and would help to avoid false positives in the detection.
I think the decoding currently happens too early: the whole raw message shouldn't be decoded in one go. The encoding of the command, prefix, suffix, and params should probably be separated. I suspect that all but the suffix can be assumed to be ASCII.
To dynamically decide upon a decoding strategy (different settings for the expected encoding per channel, for example), we need to decode the command, prefix, and params before the suffix. They can then be used to inform or dictate the strategy needed to decode the suffix.
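A rough sketch of that ordering, to make the idea concrete. Everything here is an assumption for illustration rather than the library's actual code: `split_raw`, `decode_message`, and the `channel_encodings` mapping are hypothetical names, and the sketch assumes the first param is the target channel.

```python
def split_raw(raw):
    """Split a raw IRC line into (prefix, command, params, suffix), all bytes.

    Hypothetical helper; assumes the line contains at least a command.
    """
    line = raw.rstrip(b"\r\n")
    prefix = b""
    if line.startswith(b":"):
        # A leading ":" marks the prefix, which ends at the first space.
        prefix, _, line = line[1:].partition(b" ")
    # The suffix (trailing param) starts after the first " :".
    head, _, suffix = line.partition(b" :")
    command, *params = head.split()
    return prefix, command, params, suffix


def decode_message(raw, channel_encodings, default="utf-8"):
    """Decode the ASCII parts first, then use them to pick the suffix encoding."""
    prefix, command, params, suffix = split_raw(raw)
    prefix = prefix.decode("ascii", errors="replace")
    command = command.decode("ascii", errors="replace")
    params = [p.decode("ascii", errors="replace") for p in params]
    # Assumption: the first param names the channel the message was sent to.
    target = params[0] if params else ""
    encoding = channel_encodings.get(target, default)
    return prefix, command, params, suffix.decode(encoding, errors="replace")
```

For example, with a hypothetical per-channel setting of `{"#latin": "latin-1"}`, the line `b":nick!user@host PRIVMSG #latin :caf\xe9\r\n"` would have its suffix decoded as Latin-1, while messages to other channels would use the default.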
This may be related to #3.
> The encoding of the command, prefix, suffix, and params should probably be separated.
This has now been done in a882a7235869cd22c78a70ac00c3d3f5fcfb6a2f.
`to_unicode` should optionally take a default encoding. We could then fall back to the encoding detection code that we already have.
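Something along these lines might work. This is only a sketch under assumptions: the `encoding` and `detect` parameters are hypothetical, and `detect` stands in for the existing detection code (any callable that takes bytes and returns an encoding name, or `None` if it cannot guess).

```python
def to_unicode(data, encoding=None, detect=None):
    """Decode bytes, trying an optional expected encoding before detection.

    Sketch only: `encoding` and `detect` are assumed parameters, not the
    current signature.
    """
    if encoding is not None:
        try:
            # Strict decode, so a wrong guess fails instead of producing garbage.
            return data.decode(encoding)
        except UnicodeDecodeError:
            pass  # Fall through to detection.
    if detect is not None:
        guessed = detect(data)
        if guessed:
            return data.decode(guessed, errors="replace")
    # Last resort when nothing else applies.
    return data.decode("utf-8", errors="replace")
```

Trying the caller's encoding strictly first means that detection only runs when the expected encoding does not fit, which should cut down on false positives for users who know what to expect.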