rchillyard / The-repository-formerly-known-as

UTF8 needs testing #22

Closed rchillyard closed 4 years ago

rchillyard commented 4 years ago

We don't have a good level of comfort that the UTF8 encoding is working correctly. It's quite complex and, to be honest, I don't really understand UTF8 myself. Yunlu wrote the encoding but I'd suggest that Darshan could create some tests.

Sraw commented 4 years ago

We actually have a test for UTF8:

https://github.com/rchillyard/HuskySort/blob/c3f8ef5865b401f680de55018e8d10d1bcf86f5e/src/test/java/edu/neu/coe/huskySort/sort/huskySortUtils/HuskyCoderFactoryTest.java#L67

rchillyard commented 4 years ago

Yes, I know. But for such a complex method with so many cases, I don't think one test is really sufficient. Plus, I'd like to see more of an explanation of what the test does, what's unique, etc. :)

Sraw commented 4 years ago

I pushed a branch to work on this issue. The reason there are just a few tests is that it is really difficult to manually encode UTF-8 longs. If you think this is not enough, I should be able to work on some other cases, such as German or maybe Japanese.

https://github.com/rchillyard/HuskySort/tree/addMoreUTF8Tests
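
One possible way to avoid hand-encoding expected values is to let the JDK produce the UTF-8 bytes and derive the expected long from them. The sketch below is only an illustration: `Utf8Expected` and `packUtf8Prefix` are hypothetical names, not part of HuskySort, and the packing it uses (first 8 UTF-8 bytes, most significant byte first, zero-padded) is an assumption that may differ from what the real coder does.

```java
import java.nio.charset.StandardCharsets;

final class Utf8Expected {
    // Hypothetical helper, NOT HuskySort's coder: packs up to the first
    // 8 UTF-8 bytes of s into a long, most significant byte first,
    // padding with zero bytes on the right when s is shorter.
    static long packUtf8Prefix(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        long result = 0L;
        for (int i = 0; i < 8; i++) {
            // Mask to 0..255 so negative Java bytes do not sign-extend.
            int b = i < bytes.length ? (bytes[i] & 0xFF) : 0;
            result = (result << 8) | b;
        }
        return result;
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file ASCII-only.
        String[] samples = {
            "a",            // ASCII, 1 byte
            "\u00e4",       // German a-umlaut, 2 bytes
            "Stra\u00dfe",  // German word with sharp s, 7 bytes
            "\u65e5\u672c\u8a9e" // Japanese, 3 bytes per character
        };
        for (String s : samples) {
            int length = s.getBytes(StandardCharsets.UTF_8).length;
            System.out.println(length + " bytes -> "
                    + Long.toHexString(packUtf8Prefix(s)));
        }
    }
}
```

With this particular packing, any non-ASCII lead byte sets the top bit of the long, so two packed values would need `Long.compareUnsigned` rather than `<` to preserve byte order. Whether that applies to the actual encoder depends on how it lays out its bits.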