yesworkflow-org / yw-prototypes

Research prototype with tutorial. Start here to learn about and try YesWorkflow.
http://yesworkflow.org/wiki

How to format @begin text to render nicely in graphviz et al.? #35

Open olyerickson opened 8 years ago

olyerickson commented 8 years ago

As we come up to speed, we're trying to develop best practices for naming code blocks. We'd like them to be useful one-sentence descriptions of what's happening in a code block.

I've noticed that I can hack the .gv files to remove underscores from block names and to insert newlines, thus making the graph image look better (a key to this working is having the names surrounded by quotes in the gv file).
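The manual edit described above could be scripted. What follows is a minimal sketch, not part of YesWorkflow: it assumes block names appear as bare identifiers in the generated .gv file, and the helper names `pretty_label` and `prettify_gv` and the default wrap width of 20 are illustrative choices. It relies on Graphviz treating `\n` inside a quoted ID as a line break when that ID is used as the node's label.

```python
import re
import textwrap


def pretty_label(block_name, width=20):
    """Return a quoted DOT ID: underscores become spaces, long names are wrapped."""
    words = block_name.replace("_", " ")
    lines = textwrap.wrap(words, width=width)
    # A literal backslash-n inside a quoted DOT string marks a line break.
    return '"' + "\\n".join(lines) + '"'


def prettify_gv(gv_text, block_names, width=20):
    """Replace every bare occurrence of each block name with its quoted form."""
    if not block_names:
        return gv_text
    # Longest names first so that a name is never matched as part of a longer one.
    alternatives = "|".join(
        re.escape(n) for n in sorted(block_names, key=len, reverse=True)
    )
    # Single pass: text that has already been replaced is not scanned again.
    pattern = re.compile(r'(?<![\w"])(' + alternatives + r')(?![\w"])')
    return pattern.sub(lambda m: pretty_label(m.group(1), width), gv_text)


if __name__ == "__main__":
    gv = "digraph W {\n  load_raw_data -> clean_and_normalize_values\n}"
    print(prettify_gv(gv, ["load_raw_data", "clean_and_normalize_values"], width=12))
```

Because every occurrence of a name is rewritten the same way, edges keep pointing at the renamed nodes.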

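Any thoughts on how to do this directly from the YW comments? [image: selection_542] https://cloud.githubusercontent.com/assets/358649/11871530/2cc624c8-a49e-11e5-81b2-14d03ab6262e.png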

olyerickson commented 8 years ago

Edited title to better reflect my question

ludaesch commented 8 years ago

How about a construct like so:

@begin BlockName @desc A lengthy description that can spread lines

olyerickson commented 8 years ago

That is certainly more sensible! We could then more naturally map the @begin value onto some title-like property and the @desc onto a description-like property.

While you're at it, add @type ;) ;)

tmcphillips commented 8 years ago

Another advantage of distinguishing the name of a code block from its description is that the former can be referred to in prospective and retrospective provenance queries, whereas the latter might be a little unwieldy for this purpose.

Currently, any text between the first token following a @begin, @in, @out, or @param keyword and the next YW keyword (or the end of the comment) is extracted, stored as a "description" in the internal data model, and exported among the extract facts. So you can do this:

@begin BlockName A somewhat lengthy description that must fit on one line

and (assuming this is the first annotation) you'll find this fact exported:

annotation_description(1, 'A somewhat lengthy description that must fit on one line').

Note that no quotes are needed around the description.
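The rule described above can be illustrated with a small sketch. This is not YesWorkflow's actual parser. It assumes that any token starting with `@` is a YW keyword and that annotations are numbered in order of appearance; the function name `implicit_descriptions` is hypothetical, and quotes inside a description are not escaped.

```python
# Keywords whose trailing text is treated as an implicit description.
DESCRIBABLE = {"@begin", "@in", "@out", "@param"}


def implicit_descriptions(comment):
    """Return annotation_description facts for the annotations in one comment."""
    tokens = comment.split()
    facts = []
    annotation_id = 0
    i = 0
    while i < len(tokens):
        if not tokens[i].startswith("@"):
            i += 1
            continue
        annotation_id += 1
        keyword = tokens[i]
        j = i + 1
        # The first token after the keyword is the annotation's name.
        if j < len(tokens) and not tokens[j].startswith("@"):
            j += 1
        # Everything up to the next keyword (or the end) is the description.
        description = []
        while j < len(tokens) and not tokens[j].startswith("@"):
            description.append(tokens[j])
            j += 1
        if keyword in DESCRIBABLE and description:
            text = " ".join(description)
            facts.append(f"annotation_description({annotation_id}, '{text}').")
        i = j
    return facts


if __name__ == "__main__":
    line = "@begin BlockName A somewhat lengthy description that must fit on one line"
    for fact in implicit_descriptions(line):
        print(fact)
```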

There are two catches, however. The first is that you can't see this description anywhere else! We could easily make the description for code blocks optionally appear within the corresponding boxes in the process and combined views, either with or without the block name (using a horizontal dividing line between the two if both are displayed).

The second catch is that I think we need to deprecate this syntax in favor of the explicit @desc annotation that Bertram suggests so that we can support multiple (non-description) arguments to @in and @out, e.g.

@in x y z

instead of requiring

@in x
@in y 
@in z

which tends to overwhelm the script with YW annotations.
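To make the ambiguity concrete, here is a sketch of how the same line would be read under the current rule and under the proposed one. Both function names and the dictionary layout are hypothetical and only illustrate the two interpretations.

```python
def parse_in_current(line):
    """Current rule: the first token names the port, the rest is its description."""
    _keyword, name, *rest = line.split()
    return [{"port": name, "description": " ".join(rest)}]


def parse_in_proposed(line):
    """Proposed rule: every token names a port; descriptions would need @desc."""
    _keyword, *names = line.split()
    return [{"port": name, "description": ""} for name in names]


if __name__ == "__main__":
    print(parse_in_current("@in x y z"))   # one port, x, described as "y z"
    print(parse_in_proposed("@in x y z"))  # three ports: x, y, z
```

The two readings cannot coexist, which is why supporting the shorthand means moving descriptions behind an explicit keyword.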

John, Bertram, would you like to see the graph feature that allows one to display the description first, and support for an explicit @desc keyword second? Or vice versa? Descriptions added before the @desc keyword is supported might need to be updated to include it once the keyword is added.

olyerickson commented 8 years ago

I'm not sure I completely grok the question, BUT personally I would favor @desc.

Note that for us the rendered graph is not the priority; really, we're interested in getting this info into a knowledge graph in a high-fidelity way. So whatever gets it into the model, so we can do a set of queries (for example) that let us produce RDF/linked data, is really our priority.

ludaesch commented 8 years ago

I like the shorthand @in a b c

The current implicit description is nice too, but I think an explicit @desc would make things less surprising, so I have a slight preference for requiring @desc.

Then I would show the description below the BlockName, separated by a horizontal line, by default (only if there is a description, of course; @desc would be optional).

Then, as part of some config parameter, one could suppress the display of descriptions. For querying purposes, the description text would be accessible as a property of a Block node. So one could write queries that peek into a description (say with substring or regex match) -- the future :)
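A query that "peeks into" descriptions might look roughly like the sketch below. It is purely illustrative: the list-of-dicts model, the sample block names and descriptions, and the function `blocks_matching` are all made up for the example and are not a YesWorkflow API.

```python
import re

# Hypothetical in-memory model: one dict per Block node.
blocks = [
    {"name": "load_data", "description": "Read raw frames from disk"},
    {"name": "transform", "description": "Subtract background from each frame"},
    {"name": "log_results", "description": None},  # no description given
]


def blocks_matching(blocks, pattern):
    """Return names of blocks whose description matches the regex (case-insensitive)."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [b["name"] for b in blocks if rx.search(b.get("description") or "")]


if __name__ == "__main__":
    print(blocks_matching(blocks, r"frames?"))
```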

tmcphillips commented 8 years ago

The @desc annotation is now available for qualifying code blocks and ports. There are examples in the simulate_data_collection.py script, and the model queries mq2, mq5, and mq6 display these descriptions.

Note that quotes are not needed around the description text, and descriptions end at the end of the line or comment, whichever comes first.
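For readers who have not opened simulate_data_collection.py, a small annotated function might look like the sketch below. The block name, port names, and descriptions are invented for illustration and are not taken from that script. Each @desc is kept on the same line as the annotation it qualifies, since a description ends at the end of the line.

```python
# @begin clean_readings @desc Drop negative sensor readings and convert the rest to floats
# @in raw_readings @desc Readings as strings, one per sample
# @out clean_values @desc Non-negative readings as floats
def clean_readings(raw_readings):
    values = [float(r) for r in raw_readings]
    return [v for v in values if v >= 0]
# @end clean_readings


if __name__ == "__main__":
    print(clean_readings(["1.5", "-2", "3"]))
```

The short block name stays usable in queries, while the longer text lives in the description.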

tmcphillips commented 8 years ago

Process and combined graph views can now display code block descriptions in the "program" nodes. The default value for the new graph.programlabel property is both. Alternative values for the property are name and description. See the combined view (https://github.com/yesworkflow-org/yw-prototypes/blob/master/src/main/resources/examples/simulate_data_collection/yw/combined.pdf) of simulate_data_collection.py.

olyerickson commented 8 years ago

Thanks!

I've verified this with one of our scripts, and it works.

John
