unicode-org / unicodetools

home of unicodetools and https://util.unicode.org JSPs

We need to internally identify a file as BETA in the comment #813

Open asmusf opened 5 months ago

asmusf commented 5 months ago

Now that we no longer have suffixes for property file drafts, it is essential that we identify non-final files internally, possibly with something as simple as the word "BETA" in the first line.

Or something like "This is a BETA REVIEW DRAFT" on a line by itself.

The reason I'm adding this issue here is that it may be possible to put this boilerplate into tooling.

Now, the platinum solution would be a text block that also puts the link to the BETA PRI in each file, so something like

This is a BETA REVIEW DRAFT. For details on the beta review process and how to submit feedback see PRI nnn <link>.

That way, anyone receiving a copy of this file can immediately tell that it is not final.

If this is built into the beta script, it should be possible to add/remove that with a single instruction. (I maintain a similar large collection of data files that also have a public review phase, and resorted to having the line %STATUS% in each source file; the release script changes that to "internal draft", "public review draft (with instructions)" or "" (empty string for the final release).)
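
The %STATUS% scheme described above could be sketched roughly as follows. This is a minimal illustration, not the actual release script of either project: the function name `apply_status`, the `STATUS_TEXT` table, and the stage keys are assumptions made for this example. Only the `%STATUS%` placeholder and the three replacement strings come from the comment.

```python
# Hypothetical sketch of a release step that fills in a %STATUS% placeholder.
# STATUS_TEXT and apply_status are illustrative names, not part of unicodetools.

PLACEHOLDER = "%STATUS%"

# Stage keys are assumptions; the replacement strings are the ones quoted above.
STATUS_TEXT = {
    "internal": "internal draft",
    "review": "public review draft (with instructions)",
    "final": "",  # empty string for the final release
}


def apply_status(source_text: str, stage: str) -> str:
    """Return source_text with every %STATUS% placeholder replaced for the stage."""
    if stage not in STATUS_TEXT:
        raise ValueError(f"unknown stage: {stage!r}")
    return source_text.replace(PLACEHOLDER, STATUS_TEXT[stage])
```

Because the source files always carry the placeholder, switching between draft and final output is a matter of changing one argument to the release step.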

Some scheme like this seems indicated. (We could use that same method for internal or alpha drafts as well.)

PS: a "cute" way of doing this would be to make the placeholder the line

# This file is an internal draft.

and then replace/remove that for alpha/beta and final as appropriate.
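
A rough sketch of that variant, where the placeholder is itself a readable comment line: the function name `set_draft_line` and the example replacement text are assumptions for illustration, and this is not the behavior of any existing Unicode script.

```python
# Hypothetical sketch: the source file carries a fixed placeholder comment line,
# and the publishing step swaps it or drops it. Names are illustrative only.
from typing import Optional

PLACEHOLDER_LINE = "# This file is an internal draft."


def set_draft_line(text: str, replacement: Optional[str]) -> str:
    """Replace the placeholder line with `replacement`, or remove it if None."""
    out = []
    for line in text.splitlines(keepends=True):
        if line.rstrip("\r\n") == PLACEHOLDER_LINE:
            if replacement is not None:
                # Keep the original line ending of the placeholder line.
                ending = line[len(PLACEHOLDER_LINE):]
                out.append(replacement + ending)
            # replacement is None: drop the line entirely (final release).
        else:
            out.append(line)
    return "".join(out)
```

Calling `set_draft_line(text, "# This is a BETA REVIEW DRAFT.")` would produce the beta form, and `set_draft_line(text, None)` the final form. A side effect of this design is that an unprocessed source file already announces itself as a draft.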

markusicu commented 5 months ago

We publish alpha and beta files in a location whose path contains "draft". This indicates the status at least as well as a version suffix. Some of the readmes that are next to the data files also say whether they are draft or final.

I have not heard from anyone that they were confused about the status of these files, and don't want to create unnecessary work.

I suspect that the previous presence of a version suffix was not giving much of a signal to UTC outsiders that that suffix meant "draft".

CLDR and ICU publish data files, but we don't go out of our way to switch their contents between "draft" and "release" all the time. And again, I am not aware of people being confused.

Some files are maintained manually. Several files are generated by tools from Ken or Roozbeh etc., not these Unicode Tools here.

I suppose that we could add some kind of placeholder into each file which is replaced by the production script. But it feels like a solution in search of a problem.

asmusf commented 5 months ago

If a colleague sends me a file, it doesn't help that it came from a "draft" folder.

My browser opened it from a remembered location and I did not spot the "draft" in the address bar (happened to me right now).

So that is not a robust or professional way to handle this.

Finally, any copy I make to my machine, I will have to verify against the online one to make sure I didn't accidentally grab a preliminary version.

Not being upfront about a version (or draft status) and hiding behind date stamps and folder locations is never good.

asmusf commented 5 months ago

From a discussion with @Ken-Whistler during the edcom meeting, I understand that you have some ability to supply copyright and similar information without actually editing all files. I'm simply suggesting to do something like that.

For example, in my other project, just by setting a pointer to a single file, I can insert the following notice into every one of over a hundred published files.

This is a DRAFT document released for public comments and not final. Please see the announcement on the ... website for public comments on ... for details on how to submit comments.

Not only does any recipient know that this file is not final, but there's a succinct reminder of what to do with it. On publication that gets replaced with an empty string, and all is good.
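
The "pointer to a single file" idea could look something like the sketch below. Everything here is an assumption for illustration (the function name `publish_with_notice`, the `*.txt` pattern, the directory layout, and the reuse of a `%STATUS%` placeholder); it does not describe how the Unicode production scripts or the other project actually work.

```python
# Hypothetical sketch: one notice file drives the text inserted into every
# published file. Layout, file pattern, and names are illustrative assumptions.
from pathlib import Path


def publish_with_notice(src_dir: Path, out_dir: Path, notice_file: Path,
                        placeholder: str = "%STATUS%") -> int:
    """Copy every *.txt file from src_dir to out_dir, replacing the placeholder
    with the contents of notice_file. Returns the number of files written."""
    # An empty notice file yields an empty string, i.e. the final release.
    notice = notice_file.read_text(encoding="utf-8").strip()
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    for src in sorted(src_dir.glob("*.txt")):
        text = src.read_text(encoding="utf-8")
        (out_dir / src.name).write_text(text.replace(placeholder, notice),
                                        encoding="utf-8")
        count += 1
    return count
```

With this shape, moving from public review to final publication means emptying (or swapping) the one notice file and rerunning the step; none of the source data files need to be edited.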

If a subset of files aren't maintained using a workflow that makes that possible, that's not a reason to do nothing, in my view. Same if some files lack all comments for historical reasons.

asmusf commented 4 months ago

My browser opened it from a remembered location [or from some other link that clearly did not contain "draft"] and I did not spot the "draft" in the address bar [when reading the file] (happened to me right now).

(edits added)

At first, I thought this was just me, but then I realized that I had clicked on a link that did not have "draft" in it, which I had found somewhat surprising. But then I read this in a recent pull request comment:

The link checker complains about https://www.unicode.org/Public/UCA/16.0.0/ etc. because there should be redirects from there to /Public/draft/UCA/ but they seem to be missing. I sent ... an email about that.

I understand why such redirects are useful, but it means that we cannot (or definitely should not) rely solely on the live path to indicate the draft status of a file. For example, if someone jumps the gun and tries to access a data file with the final link, they will, before the end of the beta, get a beta file that (other than date stamp) gives no indication it's not final.

We can (and should) do better.

markusicu commented 4 months ago

... I read this in a recent pull request comment:

The link checker complains about https://www.unicode.org/Public/UCA/16.0.0/ etc. because there should be redirects from there to /Public/draft/UCA/ but they seem to be missing. I sent ... an email about that.

I understand why such redirects are useful, but it means that we cannot (or definitely should not) rely solely on the live path to indicate the draft status of a file. For example, if someone jumps the gun and tries to access a data file with the final link, they will, before the end of the beta, get a beta file that (other than date stamp) gives no indication it's not final.

Note that the redirect is visible. The address bar will change to ".../Public/draft/...".

asmusf commented 4 months ago

Note that the redirect is visible. The address bar will change to ".../Public/draft/...".

Yes, there's that subtle hint. But I argue that hiding behind a dynamically changed address bar is no way to fulfill a standards organization's duty to be very explicit about what is a draft and what is final.