dev-cafe / parselglossy

Generic input parsing library, speaking in tongues.
https://parselglossy.readthedocs.io
MIT License
7 stars 2 forks source link

Reorder template according to dependencies between default actions #99

Closed robertodr closed 4 years ago

robertodr commented 4 years ago

Fixes #76, that is the "nondegenerate" case of #96 The reordering is done after checking that the template is valid: I assume that there are no cyclic dependencies at this point. I also use the fact that dependencies can only appear in lists of keywords, not between sections. The algorithm looks like this:

  1. Generate DiGraph of dependencies for defaults for the list of keywords within the top-level section. I have added the plumbing function view_by_default_keywords to get a view of defaults only for the keywords (i.e. non-recursive) This is a copy-paste of the work of @bast to detect cycles.
  2. Iterate over nodes in the DiGraph and check whether a keyword in the list of keywords has the same name as a node. If yes, deep copy them to a shuffle list and remove them from the original list. This effectively splits the list of keywords for the top-level section into two parts: "spectator" keywords, i.e. those that are independent of other keywords in the list, and "dependent" keywords (in the shuffle list). Once we're done we can append the shuffle list to the "spectator" list. Given that we loop over nodes in the DiGraph the shuffle list is already in the desired order.
  3. Recursion over sections in the template.

Step 2 is a bit wasteful as we loop over all keywords for all nodes in the DiGraph so it's quadratic in the number of keywords in the worst case. I will not waste time optimizing this as I think it's not a problem in the slightest.

Other changes in this PR not related to the topic

codecov[bot] commented 4 years ago

Codecov Report

Merging #99 into master will decrease coverage by 0.21%. The diff coverage is 91.80%.

bast commented 4 years ago

Thanks a lot for this! Looking at it ...

bast commented 4 years ago
  1. Iterate over nodes in the DiGraph and check whether a keyword in the list of keywords has the same name as a node. If yes, deep copy them to a shuffle list and remove them from the original list. This effectively splits the list of keywords for the top-level section into two parts: "spectator" keywords, i.e. those that are independent of other keywords in the list, and "dependent" keywords (in the shuffle list). Once we're done we can append the shuffle list to the "spectator" list. Given that we loop over nodes in the DiGraph the shuffle list is already in the desired order.

I was expecting that we would solve this with some of the sort functions from networkx now that we know that the graph is sortable. But this is rather solved by deepcopy. I am still trying to wrap my head around it: how do we make sure that the deepcopies are processed in the right order? Looking at the tests it seems to work though so it's probably correct.

robertodr commented 4 years ago
  1. Iterate over nodes in the DiGraph and check whether a keyword in the list of keywords has the same name as a node. If yes, deep copy them to a shuffle list and remove them from the original list. This effectively splits the list of keywords for the top-level section into two parts: "spectator" keywords, i.e. those that are independent of other keywords in the list, and "dependent" keywords (in the shuffle list). Once we're done we can append the shuffle list to the "spectator" list. Given that we loop over nodes in the DiGraph the shuffle list is already in the desired order.

I was expecting that we would solve this with some of the sort functions from networkx now that we know that the graph is sortable. But this is rather solved by deepcopy. I am still trying to wrap my head around it: how do we make sure that the deepcopies are processed in the right order? Looking at the tests it seems to work though so it's probably correct.

Yes, indeed. It also took me some time to realize it would work out of the box. The DiGraph is already sorted because of the way we create it, however the nodes appear in reverse order. In the test example I have three keywords: flavor, color (depends on charge), and charge (depends on flavor). Upon creation the nodes in the graph are color, charge, flavor. I just reverse and loop over them. Any keyword whose name not in the list of nodes we keep in the template (a "spectator" keyword), the other we deepcopy to the shuffle list and remove from the template. We append them to the list of keywords in the template when we're done looping over nodes. So the trick is that the nodes are ordered already. I did try nx.algorithms.dag.topological_sort on our DiGraph but, at least on my test sample, this was basically a no-op.