Open domenic opened 9 years ago
It is clear to me. :-) But then again, I'm clearly not the target audience.
More seriously: let me describe what is going on, and then together we can figure out what should be added to the spec.
In the reference implementation, each railroad diagram is associated with a JavaScript function. The prose that follows the railroad diagram defines the code that is contained in that function.
result
corresponds to a local variable in that function, more precisely var result
. In this case, the result
variable is to be initialized to the result of one of three function calls, depending on which one matches the input. If you look at those functions, they each return an ES object.
More generally, each function takes a string as input, and attempts to match that string against the grammar rules. If this succeeds, the function is executed and a single value (generally a String or an Object) is returned. Otherwise, Failure is returned.
Yeah, OK, I think we can work together to make this much clearer :).
I'd state the functional nature of each parsing rule much more explicitly, with the inputs and outputs clearly stated in the header. For example, for https://specs.webplatform.org/url/webspecs/develop/#url, here's one strawman idea:
Returns: { scheme, scheme-data, username, password, host, port, path, query, fragment }
As for the data types being manipulated here, I think the best bet is to state that they are ES records. You may or may not want to adopt the relevant typographic conventions, e.g. result.[[Scheme]] instead of result.scheme.
You also need a very clear definition of "Let parsed be the result of parsing input according to the above railroad diagram." I am handwaving that this somehow creates a record with properties { file-url|non-relative-url|relative-url, query?, fragment? } and using terms like "is present" in steps 3/4 and something even more fuzzy in step 2. Would be good to explain that in some detail, and have "parsing" link to the definition.
I've largely adopted the proposal, with the following notable differences:
parsed
.file-url. I've not seen dash as an identifier character since I programmed in Cobol in college.Once we reach agreement on how the first parsing rule should look like, I'll proceed with the remainder of steps. Note: the Setter rules won't contain a returns: clause.
Looks good, the local variables technique is less confusing than I thought it would be, and is nice for readability.
I put the railroad diagram above the first step instead being included as a part of the first step. My feeling is that the railroad diagram may be referenced by multiple steps.
I don't feel too strongly about this, but I do think putting it explicitly as part of step 1 makes it clearer that you can't use the diagram in isolation, and that it's only step 1 of a multi-step process for the url(input) function.
"the value returned by the function x(input) when the function x is passed some part of the original input during the parsing of that original input according to the given railroad diagram"
This sounds ambiguous because of the "some part of the original input," although upon re-reading a few times I think I understand what you mean. Is there a better phrasing perhaps?
I haven't adopted the suggestion to use ES6 records. The standard is meant to be language agnostic. I wouldn't have a problem with an informative note, however.
Well, you need something to define what these object-like things you are returning and manipulating are. I suggested ES records because they're a purely abstract (and thus language-agnostic) concept. But you could copy the verbiage they use to define your own "record" type; that would also allow you to avoid the unfortunate typographical conventions ES records bring along.
Right now the main characteristics of these things (provisionally called records) are:
record
to <local-variable>
" implying that local variables will be records in some cases.record.prop
to <local-variable>
" implying that local variables will be strings in some cases.record.prop
"'s value.prop
property from record
".record
", i.e., these steps do return records (except when they return strings).Looks good, the local variables technique is less confusing than I thought it would be, and is nice for readability.
Thanks!
I put the railroad diagram above the first step instead being included as a part of the first step. My feeling is that the railroad diagram may be referenced by multiple steps.
I don't feel too strongly about this, but I do think putting it explicitly as part of step 1 makes it clearer that you can't use the diagram in isolation, and that it's only step 1 of a multi-step process for the url(input) function.
I've now worked through the second section (file-url). Note that step 3 references the railroad diagram. As do 3.1.1 and all of the other steps that reference local variables. Are you OK with this, or do you have any further suggestions?
"the value returned by the function x(input) when the function x is passed some part of the original input during the parsing of that original input according to the given railroad diagram"
This sounds ambiguous because of the "some part of the original input," although upon re-reading a few times I think I understand what you mean. Is there a better phrasing perhaps?
That is indeed poorly phrased but the best I have come up with so for. Perhaps I will come up with something better eventually. Meanwhile, suggestions are encouraged.
I haven't adopted the suggestion to use ES6 records. The standard is meant to be language agnostic. I wouldn't have a problem with an informative note, however.
Well, you need something to define what these object-like things you are returning and manipulating are. I suggested ES records because they're a purely abstract
Good point. Will look into.
I've now worked through the second section (file-url). Note that step 3 references the railroad diagram. As do 3.1.1 and all of the other steps that reference local variables. Are you OK with this, or do you have any further suggestions?
Seems reasonable. To make sure I understand, the idea here is that if the top production is taken, you do steps 3.1; if the middle, 3.2, etc.?
One other thing:
file-url(input) returns { scheme, host, path }. That means that url(input), if it takes the file-url path, will end up returning { scheme, host, path, query, fragment }. That seems in conflict with the description of url(input) as returning { scheme, scheme-data, username, password, host, port, path, query, fragment }.
Seems reasonable. To make sure I understand, the idea here is that if the top production is taken, you do steps 3.1; if the middle, 3.2, etc.?
Yes. Let me know if you have any suggestions on how to make this more clear.
file-url(input) returns { scheme, host, path }. That means that url(input), if it takes the file-url path, will end up returning { scheme, host, path, query, fragment }. That seems in conflict with the description of url(input) as returning { scheme, scheme-data, username, password, host, port, path, query, fragment }.
Different paths will indeed return different results. For some paths, the values of some of these values will remain undefined
.
With the possibility of backtracking, I don't want to have sub-rules making direct updates; but I could make the top rule do so. Meanwhile, my reference implementation uses standard ES5 objects, and the getters deal with mapping undefined
to strings.
Different paths will indeed return different results. For some paths, the values of some of these values will remain undefined.
It seems good to explicitly specify this somehow, so that the return type is consistent and matches the one in the header.
So either explicitly set/return undefineds (or is it empty string? unclear) where necessary, or have some kind of rule in the prelude saying "if a field appears in the return statement but the steps don't specify its value, it is undefined/empty string."
So either explicitly set/return undefineds (or is it empty string? unclear) where necessary
It's undefined, not empty string. At the moment, the constructor part of the spec hasn't been updated. What the reference implementation does:
In case this isn't clear, look at the constructor at the top of url.js from the reference implementation. The underscore prefixes to property names is an implementation detail (i.e., not intended to be a part of the spec), and this is to differentiate between the properties themselves and the getters and setters.
In some cases, there is a difference between null and empty string. This actually is from Anne's original spec, and retained by my merged spec.
That all makes sense. To be clear, my main concern is that I think it would be ideal for clarity and understandability if each function had a consistent return type.
In sentences like
What is result? You use JavaScript object-like notation in saying
Does that mean result is a JavaScript object? Well, then what if someone created a setter on Object.prototype for "query"? Do you want to be triggering that? What realm does the object exist in? Above it says it's returning a "JSON objection," so I'm guessing JavaScript object? (The term "JSON object" isn't really well defined...)
You may wish to be using something like an ES record instead.
More generally, it might be good to explain what these steps are meant to "return". E.g. it seems like the "url" rule is meant to return some sort of { query, fragment, port } tuple? And take as an argument a string? In general the processing model implied by the railroad diagrams is not very clear.