skishore / makemeahanzi

Free, open-source Chinese character data
https://www.skishore.me/makemeahanzi/

Wikimedia Commons Stroke Order project wants to use your data #25

Open hugolpz opened 6 years ago

hugolpz commented 6 years ago

Onboarding Wikimedia Commons members.

Hello there, I come from the Stroke Order project and the Ancient Chinese Characters project on Wikimedia Commons. I just discovered your project and data and really love them; they have a much larger reach than our Stroke Order project.

The SO project was created by graphic designers and so has strong UX / design expertise. We discussed at length how to display stroke order efficiently for teaching purposes, and came up with a few elegant styles, naming conventions and file formats (300px): -sbs.gif and -red.png. [Image: screenshot from 2018-01-19 14-19-52]

I would like to work on your open data via nodejs to output png and gif images satisfying Wikimedia projects and our graphic guidelines.

Also, could I ask you for guidance while I explore your work ?

Personal notes:

parsimonhi commented 6 years ago

Hello,

Here is a demo that generates png images in your 'red' style from svg similar to those of makemeahanzi (the animCJK project is derived from makemeahanzi) : http://gooo.free.fr/animCJK/official/red.php

Hope it helps.

hugolpz commented 6 years ago

Thanks Parsimonhi, I will dig into this toolchain's licenses to see whether I can use these files on Wikimedia Commons, preferably under a CC-BY-SA or Public Domain license, as the SO project does. Basically, hanzi shapes are in the public domain. But the Arphic data and the extracted XML data are under the Arphic open license, as are the svg files produced from them. As for the red images from your demo, the reds are selected by the demo's software, so the author of the software holds the copyright. I need to understand more about this side of the licensing.

Personal memo : https://commons.m.wikimedia.org/wiki/Special:MyLanguage/Commons:Copyright_tags

parsimonhi commented 6 years ago

@hugolpz If you can use Arphic data and makemeahanzi graphic data, you can use animCJK data (the 'red' demo can be easily rewritten using only basic javascript code).

skishore commented 6 years ago

Hi Hugo, welcome! I'm familiar with the Stroke Order Project - I looked at that project (and in particular the methods used to create that data) when I started this one. I saw that the data is very high quality, but the approach took a lot of manual work.

Incorporating this dataset into the Stroke Order Project sounds like a good idea, but as you've noted, the main stumbling block is the license mismatch. What are your plans for that?

hugolpz commented 6 years ago

As for the various colorings or the svg document's design, I'm junior but can still write nodejs scripts myself, so I'm sure the copyright for the document design would be mine and openly licensed. I would also gain autonomy.

As for Arphic data :

  1. I could convert the svgs I produce into PNG via imagemagick, so the XML data is not present. When doing so, the shape present in the png is just a commonsense representation of Chinese characters in kaishu style and is in the public domain.
  2. I could alter the paths via hand-edits so my svg data differs from Arphic's. But I'm not too keen on that.

But first I need to dig into these licenses and their compatibility with my needs.

parsimonhi commented 6 years ago

@hugolpz I don't think that you can change the data format (i.e. convert the svgs into png) without mentioning from where you got the data. This is explicitly mentioned in the Arphic license (see https://github.com/skishore/makemeahanzi/tree/master/APL/english).

However, it seems that Wikimedia Commons accepts "freely licensed" works as mentioned at https://commons.m.wikimedia.org/wiki/Commons:Licensing, and according to https://freedomdefined.org/Definition, it is a "permissible restriction" to have to mention in your files where data come from (i.e. from Arphic data, makemeahanzi data or animCJK data).

hugolpz commented 6 years ago

@parsimonhi

tl;dr: The SO project does not wish to share the Arphic font file or the associated xml data, both of which are under the Arphic License. Prints of fonts are subject to fewer copyright restrictions, especially when the shapes are PD for historical reasons. Attaching the Arphic License and the MakeMeAHanzi LGPL license to each image, and requiring that they be preserved virally, is not practical for our end users. As of 2018, the SO project wants to share raster prints of the fonts, augmented by colors. Prints are not subject to the font file's license. For each raster image, a PD license is wanted. AnimCJK will not be used. A grateful citation of both Arphic and MakeMeAHanzi makes sense as a matter of fairness. I will have to ask Commons users with more expertise in copyright law and fonts. I'm also looking for alternative PD data of smaller scope; images for ~1000 frequent characters will do.

parsimonhi commented 6 years ago

@hugolpz The issue looks very complicated and I am not an expert anyway.

If you need to use some parts of the animCJK project in the future (especially if you have some interest in Japanese characters, which are not part of makemeahanzi), there is no problem.

I hope your project succeeds.

hugolpz commented 6 years ago

Correction: from a recent talk with Skishore #1, it appears it would be best to collaborate with @parsimonhi's animCJK, since you have a dedicated focus on multi-polity support (PRC, ROC, Japan, Korea, etc.).

parsimonhi commented 6 years ago

Sample of an animated gif for the Japanese version of 馬, made using animCJK and LICEcap (https://www.cockos.com/licecap/). The same process can be done with Makemeahanzi. [Image: 39340] Once everything is in place, it takes less than one minute to record and save a character animation as an animated gif.

hugolpz commented 6 years ago

:sob: :100: Which tech do you use to create your animations? (Note: I'm looking into your repos right now.) If it is nodejs, I can come and hack it further for minor things, like adding a stroke:2px solid black; or stroke="black" stroke-width="2"

parsimonhi commented 6 years ago

It is not automatic at the moment. Below is what I did:

1) I opened animCJK in a browser using the following URL: http://gooo.free.fr/animCJK/all.php?fs=302&dc=black then I entered 馬 in the data field and clicked on the "Animate" button.

2) I opened LICEcap and put its window over the square where characters are drawn. Then I clicked on the "Record" button of LICEcap, set the file name to 39340.gif, checked the "Automatic stop after" checkbox, and set the time to 12s. Then I clicked on the "Save" button.

3) I quickly hit the "Animate" button of animCJK. 12 seconds later, the gif was recorded and saved.

I think that you can automate the process that generates the animated gif using LICEcap with a tool that simulates user actions. It depends on the OS you are using. However, I haven't done that yet.

I have no idea whether you can control this using nodejs + some magic tools.

hugolpz commented 6 years ago

This is out of my reach as of now. I already have too many projects in my life. I will focus on clarifying the licenses and tackling the -red.png via nodejs, which I know a bit.
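As a starting point for that nodejs route, here is a minimal sketch that turns one record into a static svg string. It rests on assumptions to be checked against the MakeMeAHanzi README: that each line of graphics.txt is a JSON object with a `strokes` array of svg path strings, and that the paths sit on a 1024x1024 grid with a flipped y-axis, hence the `scale`/`translate` transform. The function name `recordToSvg`, its `size` and `colors` options, and the demo path data are hypothetical.

```javascript
// Sketch only: build a static svg from one makemeahanzi-style record.
// Assumption: record.strokes is an array of svg path strings in stroke order.
function recordToSvg(record, options = {}) {
  const size = options.size ?? 300;     // 300px is the size mentioned earlier in this thread
  const colors = options.colors ?? [];  // optional fill colour per stroke, black by default
  const paths = record.strokes.map((d, i) => {
    const fill = colors[i] ?? "#000000";
    return `    <path d="${d}" fill="${fill}"/>`;
  });
  return [
    `<svg xmlns="http://www.w3.org/2000/svg" width="${size}" height="${size}" viewBox="0 0 1024 1024">`,
    // Assumed coordinate fix: flip the y-axis and shift the glyph into view.
    `  <g transform="scale(1, -1) translate(0, -900)">`,
    ...paths,
    `  </g>`,
    `</svg>`,
  ].join("\n");
}

// Made-up path data, only to show the shape of the input and output.
const demo = { character: "一", strokes: ["M 100 400 L 900 400 L 900 450 L 100 450 Z"] };
console.log(recordToSvg(demo, { colors: ["#ff0000"] }));
```

The resulting svg could then be rasterized with a tool such as imagemagick, as discussed above; whether that step changes the licensing picture is a separate question.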

parsimonhi commented 5 years ago

Finally, I did it (pure javascript+HTML, nothing else): http://gooo.free.fr/animCJK/official/samples/imageFactory.html

Note 1: the svgs folder of the online demo contains only the MakeMeAHanzi svgs for the characters 鼠貓牛虎兔龍蛇馬羊猴雞狗.

Note 2: the same code can generate images for any character of the animCJK project or the MakeMeAHanzi project. Just put the svg files you need in the svgs folder (see below).

To make a demo on your own website that uses all the svgs of the MakeMeAHanzi project:

1) create a folder named "imageFactory" (any other name is ok)

2) create a subfolder of your "imageFactory" folder called "samples"

3) get "imageFactory.html" file from animCJK project (https://github.com/parsimonhi/animCJK/tree/master/samples), put it in your "samples" folder.

4) create a subfolder of your "samples" folder called "_js"

5) get "Animated_GIF.js" and "magicAcjk.js" from animCJK project (https://github.com/parsimonhi/animCJK/tree/master/samples/_js), put them in your "_js" folder. Note that "Animated_GIF.js" can also be downloaded from the Animated_GIF project (https://github.com/sole/Animated_GIF/dist), which is itself derived from several other projects. Update 2018/11/05: also get "brushingAcjk.js" from animCJK project.

6) get the "svgs" folder from MakeMeAHanzi project (https://github.com/skishore/makemeahanzi), put it in your "imageFactory" folder, or if you already put the "svgs" folder elsewhere on your website, replace "../svgs/" in "imageFactory.html" file by a relative path to your "svgs" folder.
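The layout described in steps 1 to 6 can be checked with a small nodejs script. This is only a sketch: the helper name `missingEntries` is hypothetical, and it assumes the default placement from step 6 (the "svgs" folder directly inside the top folder). It only tests that the files and folders exist, not that their contents are right.

```javascript
const fs = require("fs");
const path = require("path");

// Entries named in steps 1-6, relative to the top folder ("imageFactory" or any other name).
const EXPECTED = [
  "samples/imageFactory.html",
  "samples/_js/Animated_GIF.js",
  "samples/_js/magicAcjk.js",
  "samples/_js/brushingAcjk.js", // from the 2018/11/05 update
  "svgs",
];

// Return the expected entries that do not exist under `root`.
function missingEntries(root) {
  return EXPECTED.filter((rel) => !fs.existsSync(path.join(root, rel)));
}

// Example: report what is still missing in ./imageFactory
const missing = missingEntries("imageFactory");
console.log(missing.length === 0 ? "Layout looks complete." : `Missing: ${missing.join(", ")}`);
```

If the "svgs" folder lives elsewhere on the site, drop it from the list and adjust the "../svgs/" path in "imageFactory.html" as described in step 6.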

You can also use the svg of animCJK that are in its "svgsJa" and "svgsZhHans" folders (https://github.com/parsimonhi/animCJK/tree/master), but here, it's the MakeMeAHanzi project blog! ;-)

Pending issue: no transparency for the animated gif images.

Any comments are welcome.

Have fun!

hugolpz commented 5 years ago

1) @parsimonhi : are you French...? About 80% of CJK open-source projects are made by French dudes. Kanjivg, Wikimedia CJK project, AnkiDroid, Chinese audios via LinguaLibre.org for HSK (ex "Shtooka Recorder") and Tatoeba.org for sentences were initiated by French nerds. For some reason the French love this field.

2) Language: If I get it right, your *-red.png-like and *-sbs.gif-like demos are generated via js, right? It matters because I can hack js well while I have zero php experience. Minor edits may help, such as for file names :D Or I may provide you with naming conventions and guidelines for code / design improvement requests?

EDIT:

3) Ok, below is my review with Wikimedia Commons CJK guidelines in mind :

4) License : I encourage you to publish under CC-0 {your name or pseudo}. It's a Public Domain license, yes. Wikimedia CJK Stroke Order project contributors, including myself, decided to contribute under such a license on the basis that Chinese characters and calligraphy are themselves Public Domain and part of the intangible cultural heritage. With a CC-0 / Public Domain license, these files can be published on Commons, and a link back to your project and/or author page can still be indicated out of fairness. I would recommend that you use such a license and add it to your page.

parsimonhi commented 5 years ago

1) I am French!

2) I now have scripts that generate all the images automatically (they need more than one hour to generate all the images; I am trying to reduce this time). Can you please open an issue on https://github.com/parsimonhi/animCJK/issues in order to discuss how they work, how they can be adapted if necessary, and what storage rules have to be applied?

3) In my opinion, new strokes have to be printed UPON the previous ones, of course. I have never seen a calligrapher writing new strokes under previous strokes! I noticed that some (but not all) "red" images on Wikimedia Commons have the new strokes drawn under the previous ones. This is at least disturbing and probably an error. If you want red strokes underneath, it would be better to go from red for the first stroke to black for the last stroke.

4) Ok for a CC license. As for any image that contains some text, it makes sense that the license of the font used for drawing the text is not "transmitted" to the license of the image. Update 2018/11/08: finally, after reading many documents about font licensing, I think that distributing images derived from Arphic fonts under a CC license is questionable. The Arphic license seems to be required.
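The red-to-black suggestion in point 3 could be sketched as a colour ramp with one fill colour per stroke. This is an illustration, not the code used by animCJK or by Commons: the function name `redToBlackRamp` is hypothetical, and the linear fade on the red channel is an assumption, since the thread does not specify how the intermediate shades should be chosen.

```javascript
// Sketch only: one fill colour per stroke, fading from red (first stroke) to black (last stroke).
function redToBlackRamp(strokeCount) {
  const colors = [];
  for (let i = 0; i < strokeCount; i++) {
    // t runs from 0 (first stroke) to 1 (last stroke); a single-stroke character stays red.
    const t = strokeCount > 1 ? i / (strokeCount - 1) : 0;
    const red = Math.round(255 * (1 - t));
    colors.push(`#${red.toString(16).padStart(2, "0")}0000`);
  }
  return colors;
}

// Example: colours for a five-stroke character.
console.log(redToBlackRamp(5));
```

With this ordering, later (darker) strokes painted on top of earlier (redder) ones match the way a calligrapher layers strokes.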

hugolpz commented 3 years ago

EDIT: holy f%$# of awesomeness! I'm nerd-crying 😭 How did I miss that? Important: on Commons we tested transparent-background gifs but they were not working well, maybe due to imagemagick, so your image factory is finished. @parsimonhi I'm soooo sooorrrrryyy !!!!!!! 😭 😭 😭 😭

I'm resuming efforts on this side and I now have a wikibot ! : )

hugolpz commented 3 years ago

Edit. @parsimonhi ❤️

hugolpz commented 3 years ago

Wikimedia Commons Template:PD-font [Image: Screenshot_2021-03-29-16-35-13-224]