Open Psycojoker opened 10 years ago
A pair is a good idea. I propose a structure making it explicit:
"split": [(" \", "\n"), (" \", "\r\n"), (" ", "")]
I'm not quite satisfied with it but it allows easy iteration:
Number of lines (= newlines + 1)?
len(node.split)
Iterating only on values (no newlines):
[p[0] for p in node.split]
Iterating on everything:
[v for p in node.split for v in p]
I was thinking about using python slice advanced notation in my solution, like this:
In [6]: [" \\", "\n", " \\", "\r\n", " "][::2]
Out[6]: [' \\', ' \\', ' '] # spaces
In [7]: [" \\", "\n", " \\", "\r\n", " "][1::2]
Out[7]: ['\n', '\r\n'] # cariage return
I'm thinking about it, but I don't think that 2 newlines operators can follow each other.
Number of lines (= newlines + 1)?
More like that I think:
In [8]: len([" \\", "\n", " \\", "\r\n", " "][1::2])
Out[8]: 2
Indeed! Let's stick to the easier data structure.
In its current form, the space token might contains a "\n", since the platform dependant line return might be a problem, it would be good to add a new key containing a splited version of the string.
Like:
A coherant structure would be good, like the fact that carriage return are always in pair positions of the splited list (thus needed an empty string if it stars with a carriage return).