ufal / treex

Treex NLP framework
33 stars 6 forks source link

ViewJSON - trees visualization for kontext #32

Closed m1ch4ls closed 8 years ago

martinpopel commented 8 years ago

lib/Treex/Block/ViewJson/Node.pm and lib/Treex/Block/ViewJson/TreeLayout.pm are not blocks (they are not derived from Treex::Core::Block, they do not implement any process_* method), so they should not be in the lib/Treex/Block/ directory. I would appreciate any comments on this commit (what is its intend).

m1ch4ls commented 8 years ago

This is a work in progress with me and @natalink

We want to take visualization from https://github.com/ufal/treex-web and make it available for Kontext as well. The way it's done is converting Treex Document to JSON and composing trees in the browser using Javascript and SVG.

I am not sure whether we want to have the blocks responsible for conversion in main treex repo. That's why I asked @natalink to put it to separate branch so I can look at it. We may end up creating separate project or putting this to https://github.com/ufal/js-treex-view

Anyway where would you put ViewJson/Node and ViewJson/TreeLayout - they are just auxiliary wrappers?

martinpopel commented 8 years ago

Thanks for the answer and I am looking forward to the results. Separate branch in this repo is OK, I just saw this Pull Request (and have not noticed the WIP in title) and thought that you really want to merge it to master.

where would you put ViewJson/Node and ViewJson/TreeLayout

Treex::Tool::ViewJson seems OK. If you would like to integrate it to https://metacpan.org/release/Treex-Core (I would need to see ViewJson in action before judging whether it is a good idea), we could find some namespace in Treex::Core.

Treex::Block::Write::ViewJSON can stay in this location, that's a real block.

m1ch4ls commented 8 years ago

(and have not noticed the WIP in title)

I have put it there after you've asked the question. I can see how without it it's confusing...

m1ch4ls commented 8 years ago

I have rewrote the block.

# Install dependency
cpanm -n Treex::View

@martinpopel Can you please review? I have made a separate package Treex::View which handles the conversion to JSON. It also introduces little utility command view-treex treex_file.treex that can be used to display treex files - it doesn't require TrEd. Should be much easier to install than TrEd, but it has only basic features.

@natalink I think this block is finally usable. For Kontext you will have to limit number of bundles per document to one so you have a one file per sentence.

treex Read::Treex from=test.treex.gz bundles_per_doc=1 Write::ViewJSON to=.

See the parameter bundles_per_doc. Although I don't know how to tune the output filenames to have them named by bundle ID. I'd like to have something like:

filename-xyz.treex.gz -> split one bundle per doc -> filename-xyz/{$bundle->id}.json

I haven't figured out how to do it, yet.

martinpopel commented 8 years ago

how to tune the output filenames to have them named by bundle ID

A simple solution is to use the parameter substitute (and rely on the fact that bundles_per_doc adds three-digits numbers to document names, so we can just reuse it):

Write::ViewJSON to=. substitute='{(\d\d\d)\.}{/$1.}'

The same can be achived with a more flexible solution:

Util::Eval doc='$.set_file_stem($.file_stem."/")' Write::ViewJSON to=.

which can be easily adapted to change also $.file_number and $.file_path.

martinpopel commented 8 years ago

Can you please review?

Nice. Can you add to POD: NAME and DESCRIPTION which would briefly explain the purpose of this block (and mention that Treex::View needs to be installed first - I know it is in the error message, but it should be documented as well).

m1ch4ls commented 8 years ago

Comments finally added - I think it could be merged.

m1ch4ls commented 8 years ago

Sorry, closed by accident :angry: