WICG / scroll-to-text-fragment

Proposal to allow specifying a text snippet in a URL fragment

Ideas for better accessibility and privacy #187

Closed ghost closed 2 years ago

ghost commented 2 years ago

Hi all!

1. Feature-name

Encrypted text with md5.

2. Feature-description

My goal here would be to have an encrypted url in order to hide the part of the information on the network.

2.1 Why?

  1. Increase scroll-to-text-fragment privacy and security.
  2. Hashing algorithms can be used to authenticate data. "This allows us to hide what text within the network we are accessing."
  3. I chose the MD5 algorithm just as a demo; there are alternatives such as SHA-256.

2.2 proof of concept?

[image]

2.3 Demo?

Input: https://en.wikipedia.org/wiki/History_of_computing#:~:text=The%20first%20recorded,Williams

Output: https://en.wikipedia.org/wiki/History_of_computing#:~:text=5584578475ca03db28e007feedc0bef9 (md5)
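A minimal sketch of how such a URL might be produced, assuming Node.js and its built-in `crypto` module. The helper name `hashedTextFragment` is hypothetical, and hashing the percent-decoded text is an assumption, since the demo does not say exactly which string is hashed.

```javascript
const { createHash } = require("node:crypto");

// Hypothetical helper: replaces the value of the text directive with its MD5 digest.
// Assumption: the percent-decoded text is what gets hashed; the demo does not specify this.
function hashedTextFragment(url) {
  const marker = "#:~:text=";
  const idx = url.indexOf(marker);
  if (idx === -1) return url; // no text directive, nothing to hash
  const base = url.slice(0, idx);
  const text = decodeURIComponent(url.slice(idx + marker.length));
  const digest = createHash("md5").update(text, "utf8").digest("hex");
  return base + marker + digest;
}

const input =
  "https://en.wikipedia.org/wiki/History_of_computing#:~:text=The%20first%20recorded,Williams";
console.log(hashedTextFragment(input));
```

The digest is a one-way value, so a browser receiving this URL would not be able to recover the original text from it directly.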

3. Notes

I'm not promoting or wanting to promote any product, company, person, technology, service, business. My purpose in referencing the links is for bibliography only.

ghost commented 2 years ago

Hi all!

1. Feature-name

Hidden and inclusive reference in text.

2. Feature-description

My goal here would be to have an encrypted url in order to hide the part of the information on the network.

2.1 Why?

2.2 proof of concept?

[image]

2.3 Demo?

Input: https://en.wikipedia.org/wiki/History_of_computing#:~:text=The%20first%20recorded,Williams

Output: https://en.wikipedia.org/w/index.php?title=Cat#:~:text=2,31 (hidden and inclusive reference in text, for example: 2 words, 31 characters)

ghost commented 2 years ago

Concepts

  1. https://example.com/#:~:text=character_start&character_end&elementHTML
  2. https://example.com/#:~:text=characters&words&lines&without_white_space&paragraphs&spaces&sentences
  3. https://example.com/#:~:text=md5_hash
  4. https://en.wikipedia.org/wiki/History_of_computing#:~:text=md5_hash_1,md5_hash_2

aphillips commented 2 years ago

I think I understand the proposal, but I'm not sure how it addresses accessibility or privacy concerns. I'll leave that to others and comment on the I18N-related stuff 😉

The concepts "words" and "lines" are problematic.

Word-breaking is language dependent. Unicode provides a "convenient default", but this is mainly good only for operations such as cursoring, in which the user has low expectations in certain languages. In lengthier texts, the UAX29 word count would be disconnected from reality pretty significantly. It would also vary significantly between browsers and operating systems, making the reference useless.
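To illustrate the point, here is a small sketch that counts words two ways: naive whitespace splitting and ECMAScript's `Intl.Segmenter`. The helper names and the sample strings are assumptions made for illustration. No expected counts are given for the Japanese sample because the result depends on the segmentation data shipped with the runtime.

```javascript
// Naive count: split on whitespace, which finds no boundaries in unspaced scripts.
function whitespaceWordCount(text) {
  return text.split(/\s+/).filter(Boolean).length;
}

// Count word-like segments reported by the runtime's word segmenter.
function segmenterWordCount(text, locale) {
  const segmenter = new Intl.Segmenter(locale, { granularity: "word" });
  let count = 0;
  for (const segment of segmenter.segment(text)) {
    if (segment.isWordLike) count++;
  }
  return count;
}

// Sample strings are illustrative only.
const samples = [
  ["en", "The first recorded"],
  ["ja", "私は猫が好きです"],
];
for (const [locale, text] of samples) {
  console.log(locale, whitespaceWordCount(text), segmenterWordCount(text, locale));
}
```

If the two counts disagree, or differ between runtimes, a URL that says "2 words" would not point to the same text everywhere.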

Lines depend on wrapping behavior, font size, etc.

ghost commented 2 years ago

Hi all!

1. Feature-name

Better accessibility.

2. Feature-description

My goal here is to improve the accessibility of the WICG proposal for text fragments through syntax highlighting and speech (audio).

2.1 Why?

2.2 proof of concept? syntax highlighting?

[image]

2.2.1 source-code

// wrap every occurrence of text 'The first recorded' in content
// with <span class='highlight'> (default options)
$('#content').highlight('The first recorded');

2.3 proof of concept? text-to-speech? Demo?

2.4 demo?

img 1: [image]

img 2: [image]

2.4.1 source-code

3. Notes

  1. I'm not promoting or wanting to promote any product, company, person, technology, service, business. My purpose in referencing the links is for bibliography only.
  2. There are several libraries that allow you to highlight text, convert text to audio, and parse URLs.

ghost commented 2 years ago

@aphillips Hi! Thank you for the feedback. I will try to clarify some ideas:

  1. The privacy issue could perhaps be solved with md5, sha256 or any hash algorithm in the text or "hidden and inclusive reference in text". These are ways to ensure that no one will see what you are reading or accessing.
  2. The accessibility issue can perhaps be resolved with syntax highlighting in the text and/or the Web Speech API. These are ways to make it easier to find text through syntax highlighting or audio.

bokand commented 2 years ago

Hi there!

Perhaps I'm missing something but I'm not sure I understand the motivation. The text directive is part of the URL fragment which is not sent to the network as part of a request and the directive is stripped from the URL before script on the destination page can see it.
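A short sketch of the first half of that point, using the standard `URL` class (run here under Node.js). The helper name `requestTarget` is hypothetical. It shows which part of the URL ends up in an HTTP request target and which part stays on the client.

```javascript
// Hypothetical helper: the path and query are what an HTTP client sends in the
// request target; the fragment (everything from "#") is kept client-side.
function requestTarget(href) {
  const url = new URL(href);
  return url.pathname + url.search;
}

const href =
  "https://en.wikipedia.org/wiki/History_of_computing#:~:text=The%20first%20recorded,Williams";

console.log(requestTarget(href)); // prints "/wiki/History_of_computing"
console.log(new URL(href).hash); // prints "#:~:text=The%20first%20recorded,Williams"
```

Node's `URL` class still exposes the directive through `hash`. The stripping of the directive before page script can see it is done by browsers that implement text fragments, and is not reproduced here.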

Perhaps you could elaborate on the problem you're trying to solve?

ghost commented 2 years ago

@bokand Hi! "Perhaps you could elaborate on the problem you're trying to solve?" - Privacy, Accessibility

bokand commented 2 years ago

"Privacy" and "Accessibility" are general aspects of a problem/solution but not actually describing the problem you have.

I don't see above any proposal that relates to accessibility (apologies if I missed it).

In terms of privacy, I see that you want to "hide the part of the information on the network", but in what situations? As I mentioned, when the browser makes a request to a URL containing the text directive, the fragment isn't sent to the server, so the text fragment isn't visible on the network. I don't see what problem this would be solving.

Do you have a concrete example of a user flow that this would address? i.e. what steps does the user take? what happens in the system (what network requests/responses are made)? what is the threat/risk that's occurring?

As an example of why it's difficult to evaluate in this general form: the proposal to use a hash instead of the plain text only helps (assuming an attacker can see the fragment) if the page in question requires authentication. If it's a public page, an attacker could capture the hash, load the same URL + hash in their own browser, and discover the text.

Additionally, a hash is a one way function. In order for the browser to then determine what text it should highlight it would have to try hashing every n-gram of words on the page which seems impractical.
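A rough sketch of what that search could look like, assuming Node.js, MD5, and plain whitespace word splitting. The function `findByHash` and the `maxWords` limit are hypothetical and not part of any proposal.

```javascript
const { createHash } = require("node:crypto");

const md5 = (s) => createHash("md5").update(s, "utf8").digest("hex");

// Hypothetical matcher: hash every run of 1..maxWords consecutive words and
// compare each digest with the target. Whitespace splitting is an assumption.
function findByHash(pageText, targetHash, maxWords) {
  const words = pageText.split(/\s+/).filter(Boolean);
  let attempts = 0;
  for (let start = 0; start < words.length; start++) {
    for (let len = 1; len <= maxWords && start + len <= words.length; len++) {
      const candidate = words.slice(start, start + len).join(" ");
      attempts++;
      if (md5(candidate) === targetHash) return { match: candidate, attempts };
    }
  }
  return { match: null, attempts };
}

const page = "The first recorded idea of using digital electronics for computing";
const result = findByHash(page, md5("digital electronics"), 5);
console.log(result);
```

Even on this ten-word string the matcher computes dozens of digests. The number of candidates grows with the page length times `maxWords`, and any difference in whitespace or punctuation handling between the sender and the receiver changes the digest and breaks the match.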

ghost commented 2 years ago

@aphillips @bokand Hi! Thank you all for the feedback. Please close this issue. Everything has been resolved; you clarified all my doubts. So, really, this feature doesn't make any sense.