unispeech / unimrcp

Open source cross-platform implementation of MRCP protocol
http://www.unimrcp.org
Apache License 2.0

The confidence attribute in interpretation should be an integer from 0-100 #231

Closed: godhand4826 closed 5 years ago

godhand4826 commented 5 years ago

I'm studying NLSML (Natural Language Semantics Markup Language) and found something a little odd. Here is the result file of the recog demo in this project: https://github.com/unispeech/unimrcp/blob/7794d214c469ec3ab08d58b84880630c41144f31/data/result.xml#L3 According to the W3C spec, the confidence should be "97" instead of "0.97". Correct me if I'm wrong. Have a good day.

michaelplevy commented 5 years ago

The document you are referencing is a working draft. In "Status of this Document" it says:

It is a draft document and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress".

I believe the MRCP v2 group took the work that had been done on NLSML and incorporated it into their own work. They took a snapshot of the NLSML work and incorporated it into RFC-6787. I don't believe the W3C continued to work on NLSML so the MRCP team had to take their own fork of a version of NLSML. I believe their intent was to eventually move everyone to support EMMA, but for MRCP V2, RFC-6787 defines and maintains its own NLSML schema.

Notice that RFC-6787 in section 6.3 says:

6.3. Generic Result Structure

The Natural Language Semantics Markup Language (NLSML), an XML markup based on an early draft from the W3C, is the default standard for returning results back to the client. Hence, all servers implementing these resource types MUST support the media type 'application/nlsml+xml'.

RFC-6787 in section 6.3.1 says:

6.3.1. Natural Language Semantics Markup Language

The Natural Language Semantics Markup Language (NLSML) is an XML data structure with elements and attributes designed to carry result information from recognizer (including enrollment) and verifier resources. The normative definition of NLSML is the RelaxNG schema in Section 16.1.

RFC-6787 incorporates its own NLSML schema in section 16.1. In that schema, confidence is a float between 0.0 and 1.0.

So for MRCP V2, I believe a float confidence value is correct. However, we have found that some vendors' products treat NLSML confidence as an integer between 0 and 100. This may be because they are trying to stay compatible with the NLSML working draft, or because their implementation carried forward the formats from MRCP V1.
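
If a client has to cope with both conventions, one option is to normalize the value when parsing the result. The following is only a rough sketch in Python, not anything taken from UniMRCP: the function names and the sample documents are made up for illustration, and the rule "anything greater than 1 is a 0-100 integer" is a heuristic I am assuming, not something either spec defines.

```python
import xml.etree.ElementTree as ET
from typing import List


def normalize_confidence(raw: str) -> float:
    """Map a raw confidence string onto the 0.0-1.0 range.

    Assumed heuristic (not defined by any spec): values greater than 1
    are treated as 0-100 integers, values in [0, 1] as floats.
    """
    value = float(raw)
    if value < 0 or value > 100:
        raise ValueError(f"confidence out of range: {raw!r}")
    return value / 100.0 if value > 1.0 else value


def interpretation_confidences(nlsml: str) -> List[float]:
    """Collect normalized confidences from all <interpretation> elements."""
    root = ET.fromstring(nlsml)
    confidences = []
    for elem in root.iter():
        # Strip a "{namespace}" prefix, if any, so the lookup works
        # whether or not the document declares an XML namespace.
        local_name = elem.tag.rsplit("}", 1)[-1]
        if local_name == "interpretation" and "confidence" in elem.attrib:
            confidences.append(normalize_confidence(elem.attrib["confidence"]))
    return confidences


# Hypothetical sample results, one per convention.
FLOAT_STYLE = """
<result>
  <interpretation confidence="0.97">
    <instance>one</instance>
    <input mode="speech">one</input>
  </interpretation>
</result>
"""

INT_STYLE = FLOAT_STYLE.replace('"0.97"', '"97"')

if __name__ == "__main__":
    print(interpretation_confidences(FLOAT_STYLE))
    print(interpretation_confidences(INT_STYLE))
```

Note that a value of exactly "1" is ambiguous (it could mean 1.0 or 1 out of 100); this sketch reads it as 1.0. If you know which convention a given server uses, it is safer to configure that explicitly than to guess from the value.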

godhand4826 commented 5 years ago

Ok, I got it. Thanks.