Closed djrieger closed 9 years ago
Hey guys, does anyone have some suggestions on how to implement the string table? I was thinking of something like this:
class StringTable {
private:
std::unordered_set<std::string> strings;
public:
std::unordered_set<std::string>::iterator insert(std::string item);
};
unordered_set::insert() inserts a string and returns an iterator pointing to it or, if the string already exists in the set, returns an iterator pointing to this item.
But we can't just replace Token::string_value
with an std::unordered_set<std::string>::iterator
right? What other type can we use to point to the elements in a set?
Maybe we could return a pointer to the string or something like that. Then
Yeah, regarding the pointer to a string contained in our set this sounds promising: http://stackoverflow.com/a/4070374
This sounds even more promising: http://stackoverflow.com/questions/5182122/pointers-to-elements-of-stl-containers. If I understand everything correctly, we could simply stick to a set of strings (no need for additional pointers) and keep a reference to std::unordered_set<std::string>::iterator
in our tokens since these iterators should stay valid as long as we don't remove elements from the string table.
- Check how to compare string reference equality in C++
by using the address-of operator (&): given T a, &b = a, &c = a;
all of &a == &b
, &a == &c
and &b == &c
will be true.
If I understand everything correctly, we could simply stick to a set of strings (no need for additional pointers) and keep a reference to
std::unordered_set<std::string>::iterator
in our tokens since these iterators should stay valid as long as we don't remove elements from the string table.
Almost. From http://www.cplusplus.com/reference/unordered_set/unordered_set/insert/ :
Iterator validity On most cases, all iterators in the container remain valid after the insertion. The only exception being when the growth of the container forces a rehash. In this case, all iterators in the container are invalidated.
References to elements in the unordered_set container remain valid in all cases, even after a rehash.
My current implementation in branch stringtable (which I forgot to push...) uses a mix of references and pointers. (References, since they can't be reseated nor NULL and because I have to touch less code, Pointers since references kill the copy assignment, which is needed for Token (no reseating).
Unless I find a safe and efficient way tomorrow, I'll need some help from the Lexer guys to integrate with operator/keyword lookup on Wednesday .