w3c / vc-bitstring-status-list

A privacy-preserving mechanism to publish status information for Verifiable Credentials.
https://w3c.github.io/vc-bitstring-status-list/

"bitstring" vs "bit string" #127

Closed TallTed closed 5 months ago

TallTed commented 8 months ago

(accidentally raised as VCDM#1412, belongs here)

I think we're only using "bitstring" — except in the title of the README! — to date. (200,000+ Google hits) ...

There appear to be many more instances of "bit string" in the wild (1,000,000+ Google hits).

I'm inclined to think we should switch to the latter, but I will live with whatever the WG decides.

Post-CR is probably OK for this.

TallTed commented 8 months ago

Looking further... as this may trigger changes from Bitstring to BitString in many locations, this may be more than editorial.

msporny commented 8 months ago

I did a bit of research on this before settling on "bitstring". To be sufficiently pedantic, both forms are acceptable:

https://en.wiktionary.org/wiki/bitstring
https://csrc.nist.gov/glossary/term/bitstring
https://csrc.nist.gov/glossary/term/bit_string
https://en.wikipedia.org/wiki/Bitstring

... and as @TallTed said, "bitstring" is /less commonly/ used:

https://books.google.com/ngrams/graph?content=bitstring%2Cbit+string&year_start=1800&year_end=2019&corpus=en-2019&smoothing=3

I was optimizing for a few things when settling on "bitstring":

These are admittedly weak arguments for "bitstring". The arguments for "bit string" are (arguably) as weak.

I blame the English language. :)

... and yes, this would be a normative change and would ripple across all of the documents. The non-space version is used /everywhere/ ... vocabulary, context files, specification, VCDM v2 context, code, etc. If we are going to make this change, we need to make it now.

My preference would be to just leave it as-is; changing it is lots of work for very little (if any) gain.

TallTed commented 8 months ago

I note that there is a competing (and mixed!) history in this repo on this --

As I said, I'm not going to lay down on this hill, but I'm pretty well convinced that bit string is the more formally correct form (and I don't think SEO matters much on this front).

I note that Wikipedia has redirects from both bit string and bitstring to bit array, which entry says it is also known as bit string and does not say it is also known as bitstring.

Further, Wiktionary's competing entries for bit string and bitstring are both incomplete, and neither acknowledges that the other exists.

I also note that searching the OED for bitstring returns one result, bit string, and that entry does not refer to bitstring.

NIST has 9 definitions of bit string and 5 definitions of bitstring, coming from various of their documents.

Whichever we're going to use, we should make it consistent throughout our docs and repos ASAP, and once achieved, we must make every effort to maintain that consistency going forward.

iherman commented 8 months ago

The issue was discussed in a meeting on 2024-01-16.

#### 2.6. "bitstring" vs "bit string" (issue vc-bitstring-status-list#127)

_See github issue [vc-bitstring-status-list#127](https://github.com/w3c/vc-bitstring-status-list/issues/127)._

**Brent Zundel:** We have time to talk about this.

**Manu Sporny:** As pointed by TallTed, there are two ways to refer to what we're explaining: "Bit string" vs "Bitstring".
… Both forms are fine. One is more popular.
… SEO, typing, and other arguments are weak.
… The changes require a lot of work from the editorial standpoint, for not much to gain.

**Ted Thibodeau Jr.:** Having done some painful research on the PR on this, we started from "Bit String", and it was arbitrarily changed to "Bitstring".
… From web searching, both exists. The one word, redirects to the two word.

> *Dmitri Zagidulin:* if we're the only ones that use 'bitstring'... that's perfect SEO! that's what we want!

**Ted Thibodeau Jr.:** They are usually consistent within a doc. It will be painful, and I grant that it will be painful at some point.
… The two word is more correct.

**Dmitri Zagidulin:** I posit that the one word is a feature, not a bug.

**Joe Andrieu:** I'm on the fence because I like plain English. But branding wise, bitstring seems better.

**Manu Sporny:** I don't disagree.
… Given I just found out we haven't submitted PING nor privacy, it could be we're sitting in limbo for 3 months. That's enough time to make the change.

> *Ted Thibodeau Jr.:* as far seo goes, `vc bit string` (note, no quoting, as most web searches go) is going to find them `vc bitstring` and `vc bitstring` is going to find them `vc bit string` .... the thing they really want SEO to work for is Verifiable Credentials, not bit string.

**Manu Sporny:** We just need closure. Either keep, or change it and do it.

**Brent Zundel:** The question on the table is: do we keep or change? If we change, who does the work?

**Dave Longley:** While we have the time, implementations are looking for stability.

**Ted Thibodeau Jr.:** As far as SEO goes, both the one word and two words will show up in the search results; because most people don't put searches in quotes.
… They're likely going to put VC right next to the query, and are likely to quickly find what they're looking for.

**Brent Zundel:** We've spent enough time on this today.
… Anything else?

**Joe Andrieu:** Would it need to be hyphenated when using the two words?

**Ted Thibodeau Jr.:** Both are nouns, so that could be confusing. Everything I've found treats it as an adjective.
… It could be hyphenated, but nothing I've found uses it that way.

**Brent Zundel:** That's it, thanks all. Looks like we're on track except for the PING & security reviews for bitstring.

---

msporny commented 6 months ago

It's been 3 months, and we're planning on going into CR in 3 weeks. @TallTed, I know this is not ideal, but changing things at this point feels like we're asking for trouble. Can we move ahead with just using "bitstring" consistently everywhere?

To recap, changing from "bitstring" to "bit string" would require us to:

Some of that might happen automatically, but most of it will need to be done manually. It doesn't feel worth it to me to do the rename at this point. If the WG decides to do it, then I'll do it, but I'd much rather spend my time on more critical things that need to be done.

This is the last "before CR" issue we have left that doesn't have a PR raised for it.

TallTed commented 5 months ago

As noted earlier, I won't lay down on the hill of which we use, though I do feel strongly that bit string is better.

Consistency within our documents (so readers of one of our documents know what to look for in the others) is more important than matching all external uses, which is impossible because both bitstring and bit string are used elsewhere.

I will live with just using "bitstring" consistently everywhere in our documents.

msporny commented 5 months ago

I ran `git grep -i "bit string"` across the entire repository to see whether any usages of "bit string" remain. None do, so I'm asserting that we're done and we can close this issue. If we find any further inconsistencies, we can raise a new PR to fix those (or just fix them inline if they're small enough).

Closing.
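
For anyone who wants to repeat the consistency check described above outside of git, here is a minimal Python sketch of a similar scan. It is not the WG's tooling: the function names `find_two_word_form` and `scan_tree` and the default suffix list are hypothetical, chosen only for illustration. Unlike `git grep -i "bit string"`, it walks every matching file under a directory (not only tracked files) and also accepts more than one space or tab between the two words.

```python
import re
from pathlib import Path

# Case-insensitive match for the two-word form. Allowing any run of spaces
# or tabs is an assumption that is slightly broader than the literal
# single-space pattern used with git grep.
TWO_WORD = re.compile(r"bit[ \t]+string", re.IGNORECASE)


def find_two_word_form(text):
    """Return (line_number, line) pairs for lines containing "bit string"."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if TWO_WORD.search(line)
    ]


def scan_tree(root, suffixes=(".md", ".html", ".json", ".jsonld")):
    """Scan files under root (hypothetical suffix list) and collect hits per file."""
    hits = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="replace")
            found = find_two_word_form(text)
            if found:
                hits[str(path)] = found
    return hits
```

Calling `scan_tree(".")` from a checkout returns a dictionary keyed by file path; an empty dictionary means no two-word usages were found in the scanned files. Like the `git grep` pattern, this is a plain substring match, so it would also flag an unrelated phrase such as "rabbit strings".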