tsroten / dragonmasher

Dragon Masher provides access to Chinese word/character data.
BSD 2-Clause "Simplified" License

Overlapping / similar projects #1

Open tony opened 7 years ago

tony commented 7 years ago

Nice to meet you @tsroten, my name is Tony Narlock, and I have come to find and admire your CJK libraries. I have a similar project at https://cihai.git-pull.com. It's still in early phases.

Goal is to be a successor to cjklib

Some notes on the approach I'm taking / state of dealing with data:

tsroten commented 7 years ago

@tony Thank you for the kind words 😄 It looks like you're making some good progress. Many of the design choices you've made so far are much better than what I was doing with Dragon Masher a few years ago.

By the way, I enjoyed The Tao of tmux, thanks for writing it.

I've pretty much scrapped Dragon Masher due to time constraints and my interests having shifted. Most of my time these days is spent on mycli. I'm doing my best to continue to support Zhon, Dragon Mapper, and PyNLPIR. If changes to any of those projects will help you with cihai, let me know.

tony commented 7 years ago

The time investment in getting these libraries right is more than I ever anticipated.

I think you're a better programmer than me.

Consider forming an organization for your CJK utilities. I think it's the right track to keep them in independent projects, permissively licensed, as you have them.

The labor effort is intensive. No one will always be around to support these libraries. For instance, cjklib and libUnihan have gone quite a while since their last update. My hope is to pull people together in a single organization to (hopefully) inspire people to keep things up. Most of the point of the CJK libraries, other than being nifty utilities, is the sheer sake of conservancy. Infrastructure should be ready for the open source community to fill in responsibilities. And also, I don't want to feel that the time invested has gone to waste.

I can also pitch you on bringing projects to cihai, becoming a member, and then making an open call in READMEs / reddit / HN for more members to share the burden of housekeeping it. This is the only way I think things will scale in the long term. So please let me know if you're interested in any of that: joining the cihai org as a member (just yourself), moving projects to that org and becoming a member, or waiting and seeing what happens. Anything is helpful at this point. I want all Python CJK projects to be as open as possible. It's an open offer, and I encourage anyone with CJK libs who reads this to join cihai.

unihan-etl was well-received on the Unicode mailing list. cihai is going to have a case study written about it for Open Knowledge International in the coming month or two.

The laundry list of projects on my bucket list makes me ache. cihai is one of the roadblocks, namely due to a rewrite of http://wengu.tartarie.com/ in sphinx-doc. The next thing up is unihan-db: devising a sensible format for storing and querying Unihan's data/variants.