jwilk-archive / djvusmooth

graphical editor for DjVu
GNU General Public License v2.0

Merge words, add words, delete words #11

Open jwilk opened 10 years ago

jwilk commented 10 years ago

Issue reported by kempelen at Bitbucket:

Hi Jakub,

Some very useful editing features could be implemented in the word tree. I list them with keys because, as far as I can see, there is no right-click menu on tree items. (Corresponding Edit -> Text submenu entries would be needed. These could instead sit directly in the Edit menu to save some clicks; likewise, the current sub-submenus could be placed directly in "Edit" with separator lines instead of being second-level submenus, which is inconvenient for so few entries.)

Similar features could work on "line" and "para" items, at least deletion.

There could also be a feature to fit a box to the items it contains. E.g. I could make a line fit the min/max X/Y of the words it contains, and then do the same for the para and the column.
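
A minimal sketch of the fit-to-contents idea, in Python. The `Zone` class and the `fit_to_children` function are hypothetical names made up for illustration; they are not djvusmooth's actual data model. The sketch assumes each item stores its box as `x0, y0, x1, y1` and a list of child items, and it works bottom-up so that a para is fitted only after its lines have been fitted to their words.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Zone:
    """Hypothetical text zone: a bounding box plus child zones."""
    kind: str  # e.g. "column", "para", "line", "word"
    x0: int
    y0: int
    x1: int
    y1: int
    children: List["Zone"] = field(default_factory=list)


def fit_to_children(zone: Zone) -> None:
    """Resize `zone` in place so it exactly encloses its children.

    Children are fitted first, so the change propagates upwards
    (word -> line -> para -> column).
    """
    if not zone.children:
        return  # a leaf (e.g. a word) keeps its own box
    for child in zone.children:
        fit_to_children(child)
    zone.x0 = min(c.x0 for c in zone.children)
    zone.y0 = min(c.y0 for c in zone.children)
    zone.x1 = max(c.x1 for c in zone.children)
    zone.y1 = max(c.y1 for c in zone.children)


# Usage: a line whose box is far too large for its two words.
line = Zone("line", 0, 0, 500, 500, [
    Zone("word", 10, 20, 50, 40),
    Zone("word", 60, 18, 120, 42),
])
para = Zone("para", 0, 0, 999, 999, [line])
fit_to_children(para)
```

Because only minima and maxima are taken, the sketch does not depend on whether the Y axis points up or down.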

What do you think? I would really like to help with these, but I don't know Python and I don't understand the source code that I checked. :-(

Thanks, Ferenc

jwilk commented 10 years ago

Comment submitted by kempelen at Bitbucket:

Jakub, meanwhile I've implemented a similar tool that includes these features. It's not as easy to use as djvusmooth because I made it as a web app: it requires exporting the data to XML and the images to PNG. But it already contains most of the above-mentioned features and looks similar to DjVuSmooth. I'll let you know when it's ready if you are interested.

jwilk commented 10 years ago

Comment submitted by kempelen at Bitbucket:

Hi Jakub, my editor that can do the features listed above is here: http://sourceforge.net/projects/webdjvutexted/

jwilk commented 10 years ago

Comment submitted by @jsbien:

I think your editor would have many more users if it used hOCR instead of DjVu-specific XML (I hope you are familiar with Jakub Wilk's DjVu hOCR utilities).

jwilk commented 10 years ago

Comment submitted by kempelen at Bitbucket:

Hi Janusz, no, I didn't know djvu2hocr and hocr2djvu, thank you! The hOCR format looks too complicated (or rather, too loosely, too freely defined!) compared with the very strict DjVu XML, and so does the hOCR output from Jakub's tools.

The DjVu XML format has a strict structure, and the JavaScript tree editor used in my program (jstree.com) allows defining a rigid structure that the user cannot break, so these things work perfectly together. I don't really see a chance to support a much more flexible structure like HTML+hOCR. :-( If someone wants to create DjVu as the final output, hOCR is not a useful step, unless they plan to keep the more advanced markup (headings, tables, etc.) that DjVu cannot store, for other purposes.
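
To illustrate what a "rigid structure that the user cannot break" could mean, here is a rough Python sketch. It is not taken from the web editor or from jstree; `LEVELS`, `can_contain` and `is_valid` are hypothetical names. It assumes the zone levels used for DjVu hidden text (page, column, region, para, line, word, char) and assumes the only rule is that a child must sit at a strictly lower level than its parent. Whether a real editor would also forbid skipping levels is left open.

```python
from typing import List, Tuple

# Zone levels from outermost to innermost (assumed ordering).
LEVELS = ["page", "column", "region", "para", "line", "word", "char"]

# A node is (kind, children); purely illustrative representation.
Node = Tuple[str, List["Node"]]


def can_contain(parent: str, child: str) -> bool:
    """True if `child` is a strictly lower level than `parent`."""
    return LEVELS.index(child) > LEVELS.index(parent)


def is_valid(node: Node) -> bool:
    """Check the whole subtree against the nesting rule."""
    kind, children = node
    if kind not in LEVELS:
        return False
    return all(
        child[0] in LEVELS and can_contain(kind, child[0]) and is_valid(child)
        for child in children
    )


# Usage: a word inside a line is fine, a line inside a word is not.
good: Node = ("page", [("para", [("line", [("word", [])])])])
bad: Node = ("page", [("word", [("line", [])])])
```

An editor built on such a rule can simply refuse any drag-and-drop or insert operation for which `can_contain` returns `False`.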

Thank you, Ferenc

jwilk commented 10 years ago

Comment submitted by @jsbien:

You are right that the format is loosely defined, so we treat the output of djvu2hocr as the reference :-)

hOCR is output in particular by ocrodjvu, so the workflow would be straightforward, but I understand your reasons.

No time yet to test your program, but I will do it in a week or so.

Regards

Janusz

jwilk commented 10 years ago

Unfortunately, I don't have time to implement any new features in djvusmooth. Code contributions from other people are of course welcome.