skeet70 closed this 11 years ago
I've created a prototype for how our token table should work. It runs on these two principles:
This allows us to pass all of our codes through the Token non-concrete type, but gives us the flexibility of pulling a code out of Token and determining what type of code it is simultaneously.
I've written an example of how to use our new token table:
https://gist.github.com/4661593
I'll keep developing this table, as I'd like to have helper functions that pull out the string or the type to make code easier to read. But you can start peppering your code with Tokens now.
Edit: If you want to start using it right away, I've pushed the branch containing the prototype to the repository. You can find it under "TokenTable_23".
I've pushed to develop what I think is going on, but I don't really know. If I'm doing it wrong, let me know what needs to change, or how this is supposed to work. I can't see TokenTable_23 anywhere on here, so I'm just guessing at what the function types are supposed to be.
Once a working Token type or table has been pushed, I'll keep working. Right now I can't compile, and I don't want to make a bunch of changes and additions just to go back and refactor it all.
I believe I've finished the TokenTable, including the helper functions to obtain different parts of a token (such as the type of the token, or the token itself). Please see this Gist for examples of how to obtain different parts of a token.
https://gist.github.com/4661593
Also, I've modified the digitFSA function to handle the new token data. Please see it for clarification on how to modify your own FSAs as well.
Here are the advantages of the way I've written the TokenTable.
Improved type checking - If we write a function that we intend to return an error code (MP_ERROR) and accidentally confuse it with a different token (MP_AND), the compiler will not allow this, because we need to precede each token with its subclass. In this example, we would say:
let token = ErrorCodes MP_ERROR
let otherToken = ReservedWords MP_AND
Therefore, we have fewer errors to deal with.
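To make the type-checking point concrete, here is a minimal sketch of how such a wrapped token type could be declared. It is not the actual contents of TokenTable.hs: the type names ErrorCode and ReservedWord and the function scanError are assumptions made for illustration, and the real tables would list many more tokens than the two mentioned here.

```haskell
module Main where

-- Hypothetical sub-tables; only MP_ERROR and MP_AND come from this thread.
data ErrorCode = MP_ERROR
  deriving (Show, Eq)

data ReservedWord = MP_AND
  deriving (Show, Eq)

-- The wrapper type: every code passes through Token, tagged with the
-- table it belongs to.
data Token
  = ErrorCodes ErrorCode
  | ReservedWords ReservedWord
  deriving (Show, Eq)

-- Hypothetical function that is meant to produce an error token.
scanError :: Token
scanError = ErrorCodes MP_ERROR

-- Writing `ErrorCodes MP_AND` here would be rejected by the compiler,
-- because MP_AND is a ReservedWord, not an ErrorCode.

main :: IO ()
main = do
  print scanError
  print (ReservedWords MP_AND)
```

The mix-up described above becomes a compile-time type error instead of a bug that shows up at run time.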
For those of us who would prefer not to use the meta-data functionality, the helper functions defined in TokenTable.hs can simply give us the token name. If you want the token itself, you only need to call "unwrapToken newToken" as opposed to just "newToken".
Which brings me to my next point.
Helper Functions
unwrapToken: Give this function a token and it returns the name:
let newToken = ErrorCodes MP_ERROR
unwrapToken newToken
"MP_ERROR"
getTokenType: Give this function a token and it returns the subclass:
let newToken = ErrorCodes MP_ERROR
getTokenType newToken
"ErrorCodes"
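For reference, the two helpers could be written along these lines. This is only a sketch under the assumption that Token wraps one small enumeration per table (the ErrorCode and ReservedWord type names are hypothetical); the real definitions are in TokenTable.hs and may differ.

```haskell
module Main where

-- Hypothetical sub-tables; only MP_ERROR and MP_AND come from this thread.
data ErrorCode = MP_ERROR
  deriving (Show, Eq)

data ReservedWord = MP_AND
  deriving (Show, Eq)

data Token
  = ErrorCodes ErrorCode
  | ReservedWords ReservedWord
  deriving (Show, Eq)

-- Return the name of the wrapped token, dropping the table tag.
unwrapToken :: Token -> String
unwrapToken (ErrorCodes code)    = show code
unwrapToken (ReservedWords word) = show word

-- Return the name of the table (the "subclass") the token belongs to.
getTokenType :: Token -> String
getTokenType (ErrorCodes _)    = "ErrorCodes"
getTokenType (ReservedWords _) = "ReservedWords"

main :: IO ()
main = do
  let newToken = ErrorCodes MP_ERROR
  putStrLn (unwrapToken newToken)
  putStrLn (getTokenType newToken)
```

Both helpers pattern match on the wrapper, so adding a new table means adding one constructor to Token and one clause to each helper.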
I'm pretty flexible on the names of these data types, as "ErrorCodes" might seem ambiguous being associated with "Codes". However, other than that, I'm convinced that this is the way to go.
The TokenTable module is tested and complete. Let me know if you have any questions or concerns, or if you find any bugs.
I'll also add this TokenTable file to the Wiki.
Judging from descriptions in class and on the milestone, the reserved word table is part of the Scanner, or is checked immediately after the Scanner and before the Parser. This needs to be implemented so that the format of our printout is accurate.
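One possible shape for that check is a lookup that runs on each identifier-like lexeme the Scanner produces, before the token is handed to the Parser. The sketch below is only an illustration: the Identifier constructor, the classify function, the MP_BEGIN and MP_END entries, and the case-insensitive matching are all assumptions, not something settled in this thread.

```haskell
module Main where

import Data.Char (toLower)

-- Only MP_AND comes from this thread; MP_BEGIN and MP_END are hypothetical.
data ReservedWord = MP_AND | MP_BEGIN | MP_END
  deriving (Show, Eq)

-- Identifier is a hypothetical constructor for non-reserved lexemes.
data Token
  = ReservedWords ReservedWord
  | Identifier String
  deriving (Show, Eq)

-- The reserved word table: lexeme (lower case) to token.
reservedWords :: [(String, ReservedWord)]
reservedWords =
  [ ("and", MP_AND)
  , ("begin", MP_BEGIN)
  , ("end", MP_END)
  ]

-- Classify a scanned lexeme: a reserved word if it is in the table,
-- otherwise an ordinary identifier. Matching ignores case (an assumption).
classify :: String -> Token
classify lexeme =
  maybe (Identifier lexeme) ReservedWords
        (lookup (map toLower lexeme) reservedWords)

main :: IO ()
main = mapM_ (print . classify) ["and", "BEGIN", "counter"]
```

An association list keeps the sketch dependency-free; a map from the containers package would be a reasonable swap if the table grows.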