edicl / cl-unicode

Portable Unicode library for Common Lisp
https://edicl.github.io/cl-unicode/

Integrating cl-unicode with my Google Summer of Code work #7

Open krzysz00 opened 10 years ago

krzysz00 commented 10 years ago

I'm currently working on improving SBCL's Unicode support as part of the Google Summer of Code. Part of my work, which you can find here, involved expanding the SBCL Unicode database and exposing its contents through external APIs, which are in the sb-unicode package and defined in target-unicode.lisp.

Many of the functions that my work provides overlap with those provided by cl-unicode. I am writing to you to discuss whether cl-unicode could become a (partial) wrapper around the SBCL-provided Unicode database when compiled on SBCL to avoid data duplication. I'm willing to add additional data to the internal database, such as block information, to accomplish this goal if you think it's a good idea. One possible issue is that my work stores full case mappings, while yours stores simple mappings.
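To illustrate the difference between simple and full case mappings mentioned above, here is a minimal sketch in portable Common Lisp. The names `*full-uppercase-specials*`, `simple-upcase`, and `full-upcase` are hypothetical and belong to neither library, and the table holds a single hand-written entry (U+00DF, sharp s, whose full uppercase mapping in Unicode is "SS") rather than real database contents.

```lisp
;; Illustrative only: one hand-written special case, not data taken from
;; cl-unicode or SBCL. Under full case mapping, U+00DF uppercases to "SS".
(defparameter *full-uppercase-specials*
  (list (cons (code-char #xDF) "SS")))

(defun simple-upcase (string)
  "Map each character to exactly one character, so the length never changes."
  (map 'string #'char-upcase string))

(defun full-upcase (string)
  "Like SIMPLE-UPCASE, but a character may expand to several characters."
  (with-output-to-string (out)
    (loop for char across string
          for special = (cdr (assoc char *full-uppercase-specials*))
          do (if special
                 (write-string special out)
                 (write-char (char-upcase char) out)))))
```

A simple mapping always returns a string of the same length as its input, while a full mapping can lengthen it. A wrapper that promises simple mappings therefore cannot just forward to a full-mapping function without checking the result.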

If you have any other suggestions about how my work could integrate with your library, please let me know.

hanshuebner commented 10 years ago

From a maintainer's perspective, I would like to see cl-unicode dispatch to SBCL internal functions when they're available. Obviously, it is important that the semantics of the two implementations match, but given the narrow nature of cl-unicode's functionality, this should not be too hard to achieve.

Two areas where semantic differences have prevented such merges with other libraries in the past are error reporting and performance requirements. For example, flexi-streams is relatively slow, but it handles errors gracefully and attempts to deal with encoding problems in a manner that allows restarts. This made merging flexi-streams with implementation-supplied encoding functionality undesirable, as implementations often focus on performance and thus don't report errors in the same precise way.
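As a rough illustration of dispatching to SBCL's functions when they are available, the sketch below looks up a function in the `sb-unicode` package at run time and falls back to a caller-supplied function otherwise. The helper names `sbcl-unicode-function` and `call-with-sbcl-fallback` are hypothetical. It is also an assumption that a function of the given name exists in `sb-unicode` and accepts the same arguments as the fallback.

```lisp
(defun sbcl-unicode-function (name)
  "Return the function named NAME in the SB-UNICODE package, or NIL if the
package or the function is not present in this Lisp image."
  (let* ((package (find-package "SB-UNICODE"))
         (symbol (and package (find-symbol name package))))
    (and symbol
         (fboundp symbol)
         (symbol-function symbol))))

(defun call-with-sbcl-fallback (name fallback &rest args)
  "Apply the SB-UNICODE function called NAME to ARGS when it exists;
otherwise apply FALLBACK to the same ARGS."
  (let ((native (sbcl-unicode-function name)))
    (apply (or native fallback) args)))

;; Hypothetical usage, assuming cl-unicode is loaded:
;; (call-with-sbcl-fallback "GENERAL-CATEGORY"
;;                          #'cl-unicode:general-category
;;                          #\A)
```

Looking the symbol up at run time avoids a read-time error on SBCL versions that lack the package. The two implementations may still return values in different forms or signal errors differently, so a real wrapper would have to normalize results before the semantics could be said to match.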

krzysz00 commented 10 years ago

Since my changes aren't merged into SBCL's master yet, the API can still be adjusted. Here is an SBCL manual that contains documentation for my API in section 7.8. Please let me know if there are any additional functions you'd like to see in the sb-unicode package. There is an internal function, proplist-p, not mentioned in the documentation, that might help with has-binary-property (see tools-for-build/ucd.lisp for a list of the properties we support or could support).
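A sketch of how has-binary-property might be routed through proplist-p follows. It rests on assumptions that this thread does not confirm: that proplist-p takes a character and a keyword naming the property, and that the keyword is the Unicode property name upcased with underscores replaced by dashes. Both helper names are hypothetical.

```lisp
(defun property-name-to-keyword (name)
  "Convert a Unicode property name such as \"White_Space\" to a keyword
such as :WHITE-SPACE. The keyword convention is an assumption."
  (intern (string-upcase (substitute #\- #\_ name)) :keyword))

(defun binary-property-via-sbcl (char name)
  "Return two values: the result of the assumed PROPLIST-P call, and T if
that function was found in SB-UNICODE (NIL, NIL when it was not)."
  (let* ((package (find-package "SB-UNICODE"))
         (symbol (and package (find-symbol "PROPLIST-P" package))))
    (if (and symbol (fboundp symbol))
        (values (funcall symbol char (property-name-to-keyword name)) t)
        (values nil nil))))
```

The second return value lets a caller distinguish "property is false" from "SBCL lookup unavailable" and fall back to cl-unicode's own tables in the latter case.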

hanshuebner commented 10 years ago

I will not be able to spend time looking at cl-unicode or your code to give advice, so I need to trust you to figure out how your proposed integration can be made to work. Sorry.

2014-07-20 16:19 GMT+02:00 Krzysztof Drewniak notifications@github.com:

Since my changes aren't merged into SBCL's master yet, the API can still be adjusted. Here (https://www.dropbox.com/s/n5qn4v8p13zg4dg/sbcl-manual-unicode-algorithms-for-github.pdf) is an SBCL manual that contains documentation for my API in section 7.8. Please let me know if there are any additional functions you'd like to see in the sb-unicode package. There is an internal function, proplist-p, not mentioned in the documentation, that might help with has-binary-property (see tools-for-build/ucd.lisp for a list of the properties we support or could support).


krzysz00 commented 10 years ago

I'll look through the cl-unicode API and try to add any functionality that might be needed to make integration easier.

krzysz00 commented 10 years ago

As far as I know, my unicode-algorithms branch now puts all the data that cl-unicode uses in the SBCL database.