Azure-Samples / azure-search-openai-demo-java

This repo is the Java version of Microsoft's sample app for ChatGPT + Enterprise data.
MIT License

Add ideographic and full-width punctuation to splitter #77

Closed: tonybaloney closed this pull request 3 months ago

tonybaloney commented 3 months ago

Purpose

Better support for CJK languages by adding full-width and ideographic Unicode punctuation to the text splitter
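To illustrate the idea, here is a minimal sketch of a sentence splitter that treats ideographic and full-width punctuation as sentence endings alongside the ASCII ones. It is not the code from this pull request: the class name `CjkSentenceSplitter`, the `split` method, and the exact set of code points are assumptions made for the example, and the repo's real splitter may use a different set and different chunking logic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical example class, not part of the repository.
public class CjkSentenceSplitter {

    // ASCII sentence endings plus their ideographic and full-width counterparts.
    // The choice of code points here is an assumption for illustration.
    private static final Set<Character> SENTENCE_ENDINGS = Set.of(
        '.', '!', '?',
        '\u3002', // ideographic full stop
        '\uFF0E', // fullwidth full stop
        '\uFF01', // fullwidth exclamation mark
        '\uFF1F', // fullwidth question mark
        '\uFF61'  // halfwidth ideographic full stop
    );

    // Splits text after every sentence-ending character, keeping the
    // punctuation with its sentence and dropping surrounding ASCII whitespace.
    public static List<String> split(String text) {
        List<String> sentences = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            current.append(c);
            if (SENTENCE_ENDINGS.contains(c)) {
                addIfNotBlank(sentences, current.toString());
                current.setLength(0);
            }
        }
        // Keep any trailing text that has no closing punctuation.
        addIfNotBlank(sentences, current.toString());
        return sentences;
    }

    private static void addIfNotBlank(List<String> out, String s) {
        String trimmed = s.trim();
        if (!trimmed.isEmpty()) {
            out.add(trimmed);
        }
    }

    public static void main(String[] args) {
        for (String s : split("今日は晴れです。明日は雨ですか？Yes. はい！")) {
            System.out.println(s);
        }
    }
}
```

Without the extra code points, a splitter that only knows `.`, `!`, and `?` would treat a whole Chinese or Japanese paragraph as a single sentence, since CJK text typically has no spaces or ASCII punctuation to break on.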

Does this introduce a breaking change?

[ ] Yes
[x] No

Pull Request Type

What kind of change does this Pull Request introduce?

[ ] Bugfix
[x] Feature
[ ] Code style update (formatting, local variables)
[ ] Refactoring (no functional changes, no api changes)
[ ] Documentation content changes
[ ] Other... Please describe:

How to Test

git clone [repo-address]
cd [repo-name]
git checkout [branch-name]
npm install

What to Check

Verify that the following are valid

Other Information