Closed tajmone closed 2 years ago
ID pattern mentioned in PDML and PML differ!
Well spotted! Thanks.
So which one is correct?
OMG! They are both wrong. Sorry for that.
The correct regex is [a-zA-Z_][a-zA-Z0-9_\.-]*
It's now fixed everywhere:
It's also fixed in PMLC, and the fix be included in the next public version 2.3.0.
Well spotted! Thanks.
As of lately, being buried in Sublime PML and all it's matching patterns, I immediately had a hunch that the PDML pattern you mentioned didn't quite ring correct, but of course I wanted to check it before opening an Issue.
OMG! They are both wrong. Sorry for that. The correct regex is
[a-zA-Z_][a-zA-Z0-9_\.-]*
It's now fixed everywhere:
How could this happen? Aren't such core definitions hard-coded somewhere like a constant which is then reused all over the place (in the source code and docs source alike)?
I was also wondering whether this pattern might belong to the info exposed via the generated pml_tags.json
. Currently the JSON file is structured:
{
"pml_tags":{
"pml_version":"2.2.0",
"pml_release_date":"2021-12-14",
"tags":[
{
which focuses mainly on the PML tags.
Wouldn't it be worth adding the ID pattern too, e.g.:
{
"pml_tags":{
"pml_version":"2.2.0",
"pml_release_date":"2021-12-14",
"custom_id_re":"[a-zA-Z_][a-zA-Z0-9_\.-]*", // '\' Needs escaping?
"tags":[
{
(or something along those lines) so that end users can extract the pattern either to use it in their syntaxes or documentation, or just to check the current pattern status? (after all, in the future the pattern might change).
Currently the generated JSON file focuses on the PML tags and their attributes, although it does contain some basic PML version info.
Wouldn't it make more sense to rename the JSON file from pml_tags.json
to pml_info.json
and expand its contents to also include other info unrelated to tags? i.e. as opposed to having multiple JSON files that could be generated via PMLC options?
E.g.
{
"pml_info":{
"pml_meta":{
"pml_version":"2.2.0",
"pml_release_date":"2021-12-14",
"pml_website":"https://www.pml-lang.dev",
"pmlc_repository":"https://github.com/pml-lang/converter"
},
"nodes_meta":{
"custom_id_re":"[a-zA-Z_][a-zA-Z0-9_\.-]*",
"nodes_total":45, // <-- i.e. whatever the number of PML nodes is!
// etc.
},
"tags":[
{
The above entries are just imaginary examples to make the point (although the PML links would be good candidates for a meta section).
IMO, it's better to pack all the info in a single JSON file, rather than having PMLC produce multiple info files. Since the JSON file is usually intended for machine processing, having too much info shouldn't be a problem — as long as the info is well organized in self-contained sections which can be filtered out.
Aren't such core definitions hard-coded somewhere like a constant which is then reused all over the place (in the source code and docs source alike)?
The regex should of course be defined at one place only. The reason for having it (currently) defined multiple times is that there are different types of source code files that cannot easily share data: PPL files, Java files, PML files, and HTML files.
Wouldn't it be worth adding the ID pattern too
Yes, good idea.
Wouldn't it make more sense to rename the JSON file from pml_tags.json to pml_info.json and expand its contents to also include other info unrelated to tags?
Yes, excellent idea too.
IMO, it's better to pack all the info in a single JSON file, rather than having PMLC produce multiple info files.
I agree, because it's for machine consumption. And it eliminates the risk of accidentally mixing JSON files related to different PML versions.
I suggest to create a dedicated discussion to clearly define and agree on the improved structure. After everything is well defined and thought out, it will be easy to implement it.
The regex should of course be defined at one place only. The reason for having it (currently) defined multiple times is that there are different types of source code files that cannot easily share data: PPL files, Java files, PML files, and HTML files.
That's why we have tools like mustache! You can generate those source files via mustache, so regardless of the source type you can use the same JSON/YAML data to fill in some critical parts. Wasn't the scope of the JSON file to help maintain syntax highlighters definitions and the like?
Here is the same, except that it's PL files, Java files, PML files, and HTML files — none of which matters to mustache. Of course, you'll add an extra step to your toolchain, but it's worth the price — I always use this approach for projects that require automating the documentation, because trying to handle these details manually is too error prone.
I suggest to create a dedicated discussion to clearly define and agree on the improved structure. After everything is well defined and thought out, it will be easy to implement it.
Good idea!
you'll add an extra step to your toolchain
That's why I didn't do it yet. My plan for later is to create a customized, one-click build tool that takes care of everything.
As said already, the regexes have been fixed everywhere.
My plan for later is to create a customized, one-click build tool that takes care of everything.
Done since version 3.0.0.
BUG: ID pattern mentioned in PDML and PML differ!
@pml-lang, in #56 you mentioned:
which prompted me to check the PML docs on how ID patterns are defined, and I immediately noticed that in the PML Manual and PML Reference whenever the custom ID pattern is mentioned it's:
[a-zA-Z][a-zA-Z0-9_-]*
and not[a-zA-Z_][a-zA-Z1-9_\.-]*
— i.e.:\.
mentioned in the PDML Specification is missing)._
according to PDML (but not PML).The same when invoking PMLC for tags info:
So which one is correct? the one with the
.
and_
or the one without them?