nlc.data.generator.js method stripSpecial is problematic, b/c it strips anything that's not in the set of ASCII characters used in words. That includes Unicode characters beyond ASCII, which might hurt us later on.
This was introduced b/c service docs might have trademarks or other characters in the service name, and stripSpecial would take care of this. For example, the doc for 'IBM Watson Dialog' sometimes calls it 'IBM Watson™ Dialog', and the trademark symbol would cause our string comparisons not to match.
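To make the trade-off concrete, here is a minimal sketch of the behavior described above. The function stripSpecialSketch is a hypothetical stand-in, not the actual stripSpecial code (which isn't shown in this ticket); it assumes the stripping is done with an ASCII-only word-character regex.

```javascript
// Hypothetical stand-in for stripSpecial; the real implementation may differ.
function stripSpecialSketch(text) {
  // \w matches only [A-Za-z0-9_], so every non-ASCII character is removed,
  // not just symbols like the trademark sign.
  return text.replace(/[^\w\s]/g, '');
}

const plain = 'IBM Watson Dialog';
const branded = 'IBM Watson\u2122 Dialog'; // 'IBM Watson™ Dialog'

// Without stripping, the two spellings do not compare equal.
console.log(plain === branded); // false

// Stripping makes them match, which is why the method was added...
console.log(stripSpecialSketch(branded) === plain); // true

// ...but it also damages legitimate non-ASCII text.
console.log(stripSpecialSketch('Caf\u00e9')); // 'Caf'
```

The last line is the part that could hurt us later: any service name or doc text with accented or non-Latin characters gets mangled the same way.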
The most flexible solution is to just set the doc_name attribute in service-data.json to 'IBM Watson Dialog, IBM Watson™ Dialog'. This way our string compare will find both and we don't have to do the strip-char logic.
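A rough sketch of how the comparison could work with a comma-separated doc_name. The entry shape and the helper names docNameAliases and mentionsService are assumptions for illustration; only the doc_name attribute itself comes from this ticket.

```javascript
// Hypothetical entry; a real entry in the JSON file likely has more fields.
const service = { doc_name: 'IBM Watson Dialog, IBM Watson\u2122 Dialog' };

// Split the comma-separated doc_name into individual aliases.
function docNameAliases(entry) {
  return entry.doc_name
    .split(',')
    .map((alias) => alias.trim())
    .filter((alias) => alias.length > 0);
}

// True if the text contains any alias exactly; no characters are stripped.
function mentionsService(text, entry) {
  return docNameAliases(entry).some((alias) => text.includes(alias));
}

console.log(mentionsService('Getting started with IBM Watson\u2122 Dialog', service));
console.log(mentionsService('IBM Watson Dialog API reference', service));
```

One caveat with this format: a service name that itself contains a comma would be split into two aliases, so that case would need a different separator or an escaping rule.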
As part of this we need to review our existing doc_names in service-data.json to see if any need to be adjusted, then rerun the crawler and examine the results.