pylint-bot / test

0 stars 0 forks source link

Eliminate nested sequences in node fields #266

Open pylint-bot opened 8 years ago

pylint-bot commented 8 years ago

Originally reported by: BitBucket: ceridwenv, GitHub: @ceridwen?


Some of the nodes currently contain fields that are doubly-nested sequences, usually tuples inside lists. For instance, look at Dict nodes:

>>> s = '{1: a, b: 2, 3: c}'
>>> d1 = astroid.parse(s).body[0].value
>>> print(d1)
Dict.dict(items=[ ( <Const.int l.1 at 0x7f25c8ecd198>,
              <Name.a l.1 at 0x7f25c8ecd160>),
            ( <Name.b l.1 at 0x7f25c8ecd208>,
              <Const.int l.1 at 0x7f25c8ecd1d0>),
            ( <Const.int l.1 at 0x7f25c8ecd278>,
              <Name.c l.1 at 0x7f25c8ecd240>)])

These doubly-nested sequences increase the complexity of handling for node fields because everywhere general node-processing code is required, there need to be two nested type checks and a factored-out function to handle the otherwise-identical code for the first and second levels of sequences. This is especially painful when generators are involved because on Python < 3.2, without yield from, there's no good way to factor code out of generators.

I would propose that node fields should be defined to be either individual nodes or sequences of nodes, nothing else. This will make the zipper code simpler and faster and help other places similar code is required, like #259. Nodes that currently contain nested sequences should either have the nested sequences split into multiple fields, like the stdlib AST module does for Dict,

>>> d0 = ast.parse(s).body[0].value
>>> print(ast.dump(d0))
Dict(keys=[Num(n=1), Name(id='b', ctx=Load()), Num(n=3)], values=[Name(id='a', ctx=Load()), Num(n=2), Name(id='c', ctx=Load())])

or have new subnodes defined that contain the individual nodes.


pylint-bot commented 8 years ago

Original comment by Claudiu Popa (BitBucket: PCManticore, GitHub: @PCManticore?):


Changing the Dict interface to be similar with Python's builtin ast module sounds like a good enhancement to me, there's no arguing against that.

What do you mean by nodes or sequences of nodes, nothing else? Two different nodes can have either nodes or sequences of nodes, depending on their type. Hope that I'm not understanding something else here.

pylint-bot commented 8 years ago

Original comment by BitBucket: ceridwenv, GitHub: @ceridwen?:


I mean specifically that for all the fields of a node that contain other nodes (everything currently labeled by _astroid_fields), they should contain either a direct reference to a node or a sequence of nodes, not any other possible Python object, not non-node objects other than sequences, not nested sequences, not dicts, etc.

pylint-bot commented 8 years ago

Original comment by Claudiu Popa (BitBucket: PCManticore, GitHub: @PCManticore?):


Sounds fair to me.