cylc / cylc-flow

Cylc: a workflow engine for cycling systems.
https://cylc.github.io
GNU General Public License v3.0

parsec library: improvements & extensions #2920

Open sadielbartholomew opened 5 years ago

sadielbartholomew commented 5 years ago

Master listing recording all improvements & extensions intended to be made, eventually, to the parsec library (cylc/lib/parsec/).

Originally moved from plain-text notes in the codebase in #2913. Please update in line with changes & extend with ideas.

To do:

sadielbartholomew commented 5 years ago

@kinow: I believe that in PR #2839 you addressed a number of the items on this checklist, as tentatively noted above. Can you please confirm whether this is correct? I looked through the PR's code changes, but since I am not very familiar with parsec, it was not immediately clear which test cases were added.

Indeed, if there are any other points you know have been addressed already or are no longer applicable, please update the list. Thank you.

kinow commented 5 years ago

@sadielbartholomew I think the points you marked as done are indeed right. I added the tests mainly to learn parsec, and also to prepare for the move to Python 3.

I also created #2880, which is related to #2775 and basically consists of moving the code under lib/cylc/parsec. I'm not sure whether it needs to be on this list. Other open issues related to parsec: https://github.com/cylc/cylc/issues?utf8=%E2%9C%93&q=is%3Aissue+is%3Aopen+parsec

sadielbartholomew commented 5 years ago

Thanks for clarifying @kinow.

I am aware of #2880, but since it describes a clear-cut objective which concerns parsec as a whole, I think it should definitely stay as its own Issue. I wouldn't count it as an "improvement" (at least to parsec itself) or extension as such.

I've yet to have a proper look at other open Issues concerning parsec. There may be some that would be appropriate to re-home under this list (closing the original but cross-linking back to it so that any comments there would still be viewable); feel free to consider this & move issues in this way yourself, by editing the comment. It's not worth agonising about, though. The main drive for creating this Issue was to move text files giving 'to do' work for parsec from the codebase itself into the issue tracker.

matthewrmshin commented 5 years ago

While we are into parsec and its future, we may also want to consider:

hjoliver commented 5 years ago

Python API - how much configuration file functionality do we still need in a future where users will be using more Python and less configuration file?

I'm keen for a Python API myself at this point, but I still think - unfortunately - many (most?) of our users are not sufficiently expert at programming to write suites as programs.

So my feeling is, we draw a line in the sand and continue to support existing functionality via config file, but if you want certain advanced functionality, or have very complex workflows, you have to be (or become) sufficiently expert at Python.

Also, I'd really like to modernize to YAML, say, but we can't realistically ditch the existing file format, can we? That would cause a lot of trouble. And how much work would it be to support multiple config file formats and a Python API? (As a matter of fact, I'd personally be happy to say "to use cylc-9 (say) you must convert to YAML or Python" ... but I don't see all of us agreeing with that :grimacing: !)

hjoliver commented 5 years ago

(oops, unintentional close!)

hjoliver commented 5 years ago

I have crossed out the Ordered Dict item above. That was about OrderedDict being less efficient than plain dict in Python 2. In Python 3 (guaranteed by the language from 3.7), plain dict preserves insertion order, so there's no performance hit.
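
For anyone who wants to check the ordering claim, here is a minimal sketch. The keys are made up for illustration and are not taken from parsec or any real suite.

```python
from collections import OrderedDict

# Build a plain dict one key at a time (illustrative keys only).
settings = {}
settings["title"] = "demo"
settings["scheduling"] = {}
settings["runtime"] = {}

# On Python 3.7+ a plain dict iterates in insertion order,
# which is the same order an OrderedDict built from it reports.
plain_order = list(settings)
ordered_order = list(OrderedDict(settings))

# One behavioural difference that remains: OrderedDict equality is
# order-sensitive, plain dict equality is not.
dicts_equal = {"a": 1, "b": 2} == {"b": 2, "a": 1}
ordered_equal = (
    OrderedDict([("a", 1), ("b", 2)]) == OrderedDict([("b", 2), ("a", 1)])
)
```

So swapping OrderedDict for dict should be safe wherever only iteration order matters, but any code that relies on order-sensitive comparison would need a closer look.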

hjoliver commented 5 years ago

We might still want to modify the home-built "ordered dict with defaults" code in Cylc.
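
As a rough idea of what a simplified version could look like on Python 3, here is a sketch built on plain dict. The class name `DictWithDefaults` and its `defaults` argument are hypothetical and are not the existing Cylc code.

```python
class DictWithDefaults(dict):
    """Hypothetical sketch: a plain dict with a fallback mapping.

    Explicitly set items live in the dict itself (insertion-ordered on
    Python 3.7+); lookups of missing keys fall back to ``defaults``.
    """

    def __init__(self, *args, defaults=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.defaults = defaults if defaults is not None else {}

    def __missing__(self, key):
        # Called by dict.__getitem__ only when the key is not set.
        # Raises KeyError if the key has no default either.
        return self.defaults[key]


cfg = DictWithDefaults({"b": 2}, defaults={"a": 1, "b": 0})
from_default = cfg["a"]    # not set explicitly, so the default is used
from_explicit = cfg["b"]   # the explicit value wins over the default
explicit_keys = list(cfg)  # iteration covers explicitly set keys only
```

Note that `dict.get()` and `in` do not go through `__missing__`, so a real replacement would have to decide how defaults should show up in those cases.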

matthewrmshin commented 5 years ago

I agree (and I am definitely not suggesting that we remove support for our current configuration format any time soon).

I would say that we have 2 main issues with our current file format:

I cannot see a simple path to fully move away from our current configuration format either. Jinja2 preprocessing makes it almost impossible to automate the conversion. Otherwise, we should be able to read the data structure that represents the current configuration file, and re-dump it in whatever configuration file format(s) our future selves prefer.

However, we should still keep our options open - as the technology world moves fast.
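
To illustrate the "read the data structure and re-dump it" idea, here is a minimal sketch. It assumes the configuration has already been parsed into nested dicts (so any Jinja2 has been rendered and is lost), and it uses JSON from the standard library purely as a stand-in target format. The `parsed` contents are illustrative, not the output of parsec.

```python
import json

# Hypothetical parsed result, after any Jinja2 has already been rendered.
parsed = {
    "scheduling": {
        "initial cycle point": "2020",
        "dependencies": {"graph": "foo => bar"},
    },
    "runtime": {"foo": {"script": "echo foo"}},
}

# Re-dump the nested structure in another format, then read it back.
dumped = json.dumps(parsed, indent=2)
reloaded = json.loads(dumped)
```

The round trip keeps section nesting and order, but it only captures one rendering of the file, which is exactly why templated configurations are hard to convert automatically.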

kinow commented 5 years ago

I was reading my RSS feed today and this post about YAML came up, describing some cases where YAML can be unsafe or give undesired parse results. There is more discussion in this Hacker News thread.

hjoliver commented 5 years ago

More discussion on this Hacker News thread.

... the true horror starts when people start using text template engines to generate YAML.

Uh-oh, sounds familiar :rofl: