Chinese model and content support

Hi, you should be able to invoke any SageMaker-deployed endpoint using our SageMakerModelRunner.

Currently, you can use our QA Accuracy evaluation algorithm with Chinese text, but the computed metrics may not match what you expect, mainly because Chinese text, unlike English, usually has no spaces between words.

For example, if the model_output is "学中文" and the target_output is "中文", the _f1_score function in our QA Accuracy code will output 0.0. This is because _f1_score (like other functions, such as _precision) internally calls _split, which splits text on whitespace. Since neither piece of text contains any whitespace, _split("学中文") returns the set {"学中文"}, while _split("中文") returns the set {"中文"}. These sets have no elements in common, so the computed metric is 0.0.
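To make the effect concrete, here is a minimal sketch of a whitespace-based F1 computation. It is not the actual fmeval code: split_on_whitespace and f1_score are hypothetical stand-ins for _split and _f1_score, written only from the behavior described above, and the real functions may normalize text differently.

```python
# Simplified illustration only; these helpers are hypothetical stand-ins,
# not the fmeval implementations of _split and _f1_score.

def split_on_whitespace(text: str) -> set:
    """Split text on whitespace and return the set of resulting tokens."""
    return set(text.split())


def f1_score(model_output: str, target_output: str) -> float:
    """Token-overlap F1 between two strings, using whitespace tokens."""
    predicted = split_on_whitespace(model_output)
    expected = split_on_whitespace(target_output)
    overlap = predicted & expected
    if not overlap:
        # No shared tokens: precision and recall are both 0.
        return 0.0
    precision = len(overlap) / len(predicted)
    recall = len(overlap) / len(expected)
    return 2 * precision * recall / (precision + recall)


# Chinese text has no spaces, so each string is a single token.
print(split_on_whitespace("学中文"))       # {'学中文'}
print(f1_score("学中文", "中文"))           # 0.0

# The equivalent English pair shares the token "Chinese".
print(f1_score("learn Chinese", "Chinese"))  # roughly 0.67
```

The English pair gets partial credit because "Chinese" appears as a separate token in both strings, whereas the Chinese pair is compared as two whole, unequal strings.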
