Language-based speedups, e.g. Rust-based JSON / YAML / CSV parsing. Perhaps the whole core could be a Rust-based package with bindings for other languages.
UNIHAN can be made even more accessible to the masses. I am the one who can make it happen, but it would take time and, above all, funding. This would need to be the sole focus of my free time outside of work for months, or even longer.
P.S. I am out of contact with anyone from UNIHAN. Is someone else already working on the same effort? Can this effort be shared in any way?
This project can do much more to unlock the breadth and depth of UNIHAN:
Correctness
Digging deeper into the database design, more needs to be done to ensure that extraction and interrelation are provided in a structured and detailed way.
Potentially
Perhaps https://www.unicode.org/reports/tr38/ can be crawled and used to verify correctness, and to an extent, in the future, we can generate.
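As a rough illustration of what such a correctness check could look like, here is a minimal sketch that validates field values against per-field regular expressions. It assumes the tab-separated `codepoint<TAB>field<TAB>value` layout of the UNIHAN data files. The `FIELD_PATTERNS` table and the `find_invalid` helper are hypothetical names, and the patterns are illustrative placeholders, not the syntax rules from the report itself; in practice they would be taken from the report.

```python
import re

# Illustrative placeholder patterns (assumption): real rules would be
# taken from https://www.unicode.org/reports/tr38/ rather than hand-written.
FIELD_PATTERNS = {
    "kTotalStrokes": re.compile(r"[1-9][0-9]{0,2}"),
    "kCantonese": re.compile(r"[a-z]{1,6}[1-6]"),
}


def find_invalid(lines, patterns=FIELD_PATTERNS):
    """Return (codepoint, field, value) tuples whose value fails its pattern."""
    invalid = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split("\t", 2)
        if len(parts) != 3:
            continue  # not a data row in the assumed layout
        codepoint, field, raw = parts
        pattern = patterns.get(field)
        if pattern is None:
            continue  # no rule known for this field, so nothing to check
        # Fields with a pattern are treated as space-separated value lists.
        for value in raw.split(" "):
            if not pattern.fullmatch(value):
                invalid.append((codepoint, field, value))
    return invalid


if __name__ == "__main__":
    sample = [
        "# comment line",
        "U+3400\tkCantonese\tjau1",
        "U+3401\tkTotalStrokes\t0x",
    ]
    for row in find_invalid(sample):
        print(row)
```

Fields without a known pattern are skipped rather than flagged, so the check can be adopted one field at a time as rules are collected.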