Closed ankushbhardwxj closed 4 years ago
The cache.toml file to be used in mason-registry after merge of #48 uses TOML table headers as [LocalAtomics."0.1.0"] which is invalid.
I believe [LocalAtomics."0.1.0"]
is valid TOML - is it just not supported by our TOML parser?
@ben-albrecht Yeah, it isn't I got an error while integrating the cache with mason search
. I believe we are unable to parse " "
in the header.
I wonder how hard this would be to fix in the TOML parser.
I believe [LocalAtomics.0.1.0]
is represented as a hierarchy of tables:
LocalAtomics
0
1
0
I suggested the quotes approach to make it a single table.
Does a fully quoted name work, e.g. ["LocalAtomics.0.1.0"]
?
I wonder how hard this would be to fix in the TOML parser.
I believe we need to fix the regex string which is used to parse the headers.
I had locally changed to [LocalAtomics0.1.0]
and it did work correctly for getting the score.
This wasn't treated as :
LocalAtomics0
1
0
Does a fully quoted name work, e.g. ["LocalAtomics.0.1.0"] ?
Nope that didn't work. I didn't get an error message but the score was not parsed.
It works without the .
between LocalAtomics and version.
I had locally changed to [LocalAtomics0.1.0] and it did work correctly for getting the score.
Based on the TOML spec, I believe this is incorrect behavior.
I feel mixed about using that as a work-around because it relies on incorrect behavior in the current implementation of Chapel's TOML parser, and it will cause issues when that gets fixed.
I propose we take one of the approaches:
1) Fix this bug in the TOML parser so that we can use Pkg."0.1.0"
as a table name, and so that Pkg.0.1.0
creates a hierarchy.
2) Implement a work-around that doesn't depend on existing incorrect behavior. For example, we could use a different delimiter for the versions: [Foo.0_1_0]
.
(1) is preferable, but I'm not sure how large of an effort that is. @Spartee may be able to weigh in.
To double check, I tried Python's toml package to validate my understanding of the expected behavior:
>>> toml.loads("[foo.0.1.0]")
{'foo': {'0': {'1': {'0': {}}}}}
>>> toml.loads('[foo."0.1.0"]')
{'foo': {'0.1.0': {}}}
So after a quick look I think some of the issue is how we generate tokens in the TomlReader
. Effectively everything is turned into tokens with the following gigantic regex string
const doubleQuotes = '(".*?")', // ""
singleQuotes = "('.*?')", // ''
bracketContents = "(\\[\\w+\\])", // [_]
brackets = "(\\[)|(\\])", // []
comments = "(\\#)", // #
commas = "(\\,)", // ,
equals = "(\\=)", // =
curly = "(\\{)|(\\})", // {}
dt = "^\\d{4}-\\d{2}-\\d{2}[ T]\\d{2}:\\d{2}:\\d{2}",
ld = "^\\d{4}-\\d{2}-\\d{2}",
ti = "^\\d{2}:\\d{2}:\\d{2}(.\\d{6,})?";
const pattern = compile('|'.join(doubleQuotes,
singleQuotes,
bracketContents,
brackets,
commas,
curly,
equals,
dt,
ti,
ld));
This bug is actually already documented in the improve TOML issue.
So this gets at a bigger problem with the TOML
module that I wish I did the first time which is to have two passes of tokenization. The first pass would seperate the TOML data containers: tables, arrays, arrays of tables(not implemented), and the second would populate the tokens within those containers. I know this is a bit hand-wavey but thats at least how I had it in my mind when thinking about how to handle these cases.
The shorter term fix, I think, is to reorder the regex arguments so that tables come before the double quotations. I haven't tried this myself though.
Another possible option would be to get fancier with the regex and have a tblName
regex that would include specific regex for table names.
Closing this issue here since it has been fixed by https://github.com/chapel-lang/chapel/pull/16091
The cache.toml file to be used in mason-registry after merge of #48 uses TOML table headers as
[LocalAtomics."0.1.0"]
which is invalid as per parsing by our Chapel's TOML module. This may be updated to[LocalAtomics.0.1.0]
or[LocalAtomics0.1.0]
.