Open bleucitron opened 8 years ago
@iOiurson I don't know if this is still important to you...
I have had a look through this today trying to get to the bottom of #139 and found the only reason for the openpyxl tokenizer in koala is because we are using the openpyxl Translator. Both the Translator and tokenizer were copies of openpyxl rather than importing openpyxl. Through this we have missed out on a lot of updates to openpyxl.
My guess is that historically it made sense to have our own tokenizer. Maybe there is potential to wrap the pyopenxl one and get rid of a heap of code that isn't being maintained (ergo the problems with multi-line formulae #139).
@vallettea What are your thoughts?
For completeness: I have removed the koala copy of the openpyxl code making openpyxl a dependency instead. Koala works fine and the Travis tests are passing. There is a pull request with this change.
Can confirm the openpyxl.formula.tokenizer Token object is functionally compatible with Koala f_token.
I have just used openpyxl.formula.tokenizer.Token as a parent class for f_token and after re-naming the three attributes across the Koala codebase it works a treat.
The head of openpyxl.formula.tokenizer.Token object:
class Token(object):
"""
A token in an Excel formula.
Tokens have three attributes:
* `value`: The string value parsed that led to this token
* `type`: A string identifying the type of token
* `subtype`: A string identifying subtype of the token (optional, and
defaults to "")
"""
__slots__ = ['value', 'type', 'subtype']
LITERAL = "LITERAL"
OPERAND = "OPERAND"
FUNC = "FUNC"
ARRAY = "ARRAY"
PAREN = "PAREN"
SEP = "SEP"
OP_PRE = "OPERATOR-PREFIX"
OP_IN = "OPERATOR-INFIX"
OP_POST = "OPERATOR-POSTFIX"
WSPACE = "WHITE-SPACE"
def __init__(self, value, type_, subtype=""):
self.value = value
self.type = type_
self.subtype = subtype
When we do this transformation:
The Koala f_type signature is now:
#========================================================================
# Class: f_token
# Description: Encapsulate a formula token
#
# Attributes: tvalue -
# ttype - See token definitions, above, for values
# tsubtype - See token definitions, above, for values
#
# Methods: f_token - __init__()
#========================================================================
class f_token(Token):
def __init__(self, value, type, subtype=''):
super().__init__(value, type, subtype)
def __str__(self):
return self.value
Currently, 2 different tokenizers are used in Koala:
koala/ast/tokenizer.pyx
)koala/openpyxl/tokenizer.py
).We need to merge the 2 into one to avoid complexity.