Closed ProfJanetDavis closed 5 years ago
Or in any other hidden component.
Is there a way to test if a text node is visible on screen? See https://stackoverflow.com/questions/19669786/check-if-element-is-visible-in-dom
@ProfJanetDavis Currently the extension just checks a big chunk of text HTML for stopwords, right? I'd like feedback on my plan so far, to make sure I'm not missing an obvious solution with lower overhead.
<p class='ignore-this'>Just this content here.</p>
That should fix the link specific issue and some others.offsetParent
or something similar.Yes, the extension currently checks the entire document body for stopwords: https://github.com/glam-lab/degender-the-web/blob/6af6f9826c36e6677e8515af94ddad058a98f74b/src/main.js#L85
I just wrote a very long screed which got deleted when I accidentally closed the tab. I'll take the opportunity to write something more concise instead.
The algorithm you propose above raises concerns for me. "Every element in the body" includes elements nested inside of other elements. That may be a problem for both efficiency and correctness. Do you see why?
One alternative is to follow a strategy similar to that used in replaceWordsInBody()
:
textNodesUnder()
to get all the text nodes in the body.isVisible()
, similar to isEditable()
, to check whether each text node is visible. Call this function either before or after checking the text node's contents. (Checking a parent node's type and attributes repeatedly should be faster than searching a child node's text repeatedly.)I'm pretty confident you could make this work.
Another possibility might be to use TreeWalker
directly. I didn't do that in replaceWordsInBody()
because weird things happen if you change the document tree while walking it. Hence the strategy above, where textNodesUnder()
uses a TreeWalker
to collect all the text nodes in an array before making any changes to the document. But for the problem posed in this issue, using TreeWalker
directly might be an elegant solution. I haven't investigated in depth or thought it through carefully - you'd have to do that.
What do you think?
One further thought: Don't change the line I quoted above (or the mentionsGender()
function). That's meant to be a very fast check on whether there's anything in the body worth examining more carefully. Any code that walks the document tree should be inside that conditional, similar to the final check for gender pronouns.
I fear that main()
is getting far too complicated. If you see opportunities to refactor, by all means make a proposal.
This all looks good. I'm working on a solution that uses textNodesUnder
, and I've written isVisible
and a new function visiblyMentionsGender
. Do you know of a page that produces the issue? I'd like to use it for testing.
The page that surfaced the issue was test.html
, when it linked to some articles it no longer links to. I changed it so it would be useful for the purpose I made it for. I have seen the issue arise since then, but I'm afraid I didn't note the URLs. (I have been trying to get better at that.)
Here's a contrived HTML document that illustrates as many concerns as I can think of. Please forgive my technical and conceptual sloppiness.
<!DOCTYPE html> <html> <body> <div style="display:none"> The previously selected gender is <span id="storedGender">unknown</span>. </div> <script> var gender = document.getCookie("gender"); document.getElementById("storedGender").innerHTML = gender; </script> <p> <a href="https://www.independent.co.uk/life-style/women/women-workplace-gender-pay-gap-maternity-leave-feminism-a8979581.html" >More women go to university than men, so why is there still a pay gap?</a> </p> <form> <p> <!-- This should really be "gender", but the client insisted on "sex" --> <label>Sex:</label> <select id="genderOption"> <option value="female">Female</option> <option value="male">Male</option> <option value="other">Other</option> <option value="unspecified">Prefer not to say</option> </select> </p> <p> <label>Your comments:</label><br/> <textarea id="comments" cols="80" rows="10"></textarea> </p> <input type="submit" value="Submit"> </form> </body> </html>
Feel free to add any other causes of "non-visibility" you can think of. I'll let you know if I disagree. For this purpose, I would consider any text that can eventually be seen by scrolling to be user-visible.
I wish I already had those automated end-to-end tests in place!
I actually just noticed that the third answer includes someone else's definition of isVisible
which you might want to compare to yours: https://stackoverflow.com/questions/19669786/check-if-element-is-visible-in-dom
I've implemented a solution that works for the HTML above using only offsetParent
. The extension still works normally for other cases. As the Stack Overflow answer suggests, this is faster but less complete than using getComputedStyle
. I'm inclined to stick with this low-overhead solution until we can find a case where it fails, but I'm sure I could do it another way without too much trouble.
Okay, sounds good. I looked for a pull request and I don't see one yet. Do you want any help with testing?
I'm going to try testing on my own, then come back with questions once I know what I don't know.
Excellent!
-- Janet Davis Sent from my "smart" phone, in haste, with auto-(in)correct
On Sep 4, 2019, at 8:14 AM, Ian Hawkins notifications@github.com wrote:
I'm going to try testing on my own, then come back with questions once I know what I don't know.
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.
Since the href of a link is not visible on the screen, this is a confusing behavior.