Closed jonpincus closed 3 years ago
It should be shipped as a separate bundle of rules 🙂
Doing it as a separate bundle isn't ideal because the results are related. If the image isn't listed in the header tags but is returned from the body (for example with a rule like $('article img[src]')), then the alt text needs to be taken from the same element. If that element doesn't have alt text, then letting the search progress to other rules would return alt text that belongs to a different image.
I think that is just an implementation detail.
You can look up meta[property="og:image"]
and also meta[property="og:image:alt"],
so you can correlate both values.
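To make the idea concrete, here is a minimal sketch of reading both tags together so the alt always comes from the same family of meta tags as the image. It assumes `$` is a cheerio-style query function (like the `htmlDom` that metascraper passes to rules); `metaImageWithAlt` is a hypothetical helper name, not part of any existing package.

```javascript
// Hypothetical helper: pairs each image meta tag with its matching alt tag.
// `$` is assumed to behave like cheerio: $(selector).attr(name).
function metaImageWithAlt($) {
  const pairs = [
    ['meta[property="og:image"]', 'meta[property="og:image:alt"]'],
    ['meta[name="twitter:image"]', 'meta[name="twitter:image:alt"]'],
  ];

  for (const [imageSelector, altSelector] of pairs) {
    const image = $(imageSelector).attr('content');
    if (!image) continue; // no image from this family, try the next one

    // Only take the alt from the same family as the image that was found.
    const alt = ($(altSelector).attr('content') || '').trim();
    return { image, alt: alt || undefined };
  }

  return undefined; // no meta image at all
}
```

If the image is found but its alt tag is missing, the sketch returns the image with `alt: undefined` instead of borrowing an alt from the other family.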
I wrote a simple version that just gets it from og:image:alt and twitter:image:alt fields (if present) ... alas they're only there for about 1/3 of the pages I tried it on.
Trying to go farther, I'm not sure how to do these correlations -- the rules in the existing packages I looked at don't have any examples of selecting an element based on the output of a previous bundle. I want to say something like $('img[src=_][alt]'), where _ is whatever the image bundle returned, but am not sure how to go about it. [Although who knows how often the image from the meta fields actually shows up in the article contents.]
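One way this lookup might work, as a rough sketch only: instead of interpolating the URL into the selector (which would need escaping), walk the `img` elements and compare each resolved `src` against the image URL. This assumes `$` is a cheerio-style function supporting `.toArray()` and `.attr()`; `altForImage` and its parameters are hypothetical, and how a rule would get hold of the image bundle's output is left open.

```javascript
// Hypothetical helper: find the alt text of the <img> whose src matches imageUrl.
// `$` is assumed to be cheerio-like; pageUrl is used to resolve relative src values.
function altForImage($, imageUrl, pageUrl) {
  let target;
  try {
    target = new URL(imageUrl, pageUrl).href;
  } catch {
    return undefined; // the image URL could not be parsed
  }

  for (const el of $('img[src][alt]').toArray()) {
    let resolved;
    try {
      resolved = new URL($(el).attr('src'), pageUrl).href;
    } catch {
      continue; // skip unparseable src values
    }

    const alt = ($(el).attr('alt') || '').trim();
    if (resolved === target && alt) return alt;
  }

  return undefined; // the image is not in the body, or it has no usable alt
}
```

Returning `undefined` when nothing matches keeps the alt tied to the same element as the image, rather than falling through to an unrelated one.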
If you want to check for non-empty values, I think it should be something like this:
https://codepen.io/starikovs/pen/tngqH
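For reference, the non-empty check can be written as an attribute selector combined with `:not()`. The sketch below assumes a cheerio-style `$` that supports `.first()` and `.attr()`; `firstNonEmptyAlt` and the default `article` scope are hypothetical choices for illustration.

```javascript
// Matches <img> elements that have an alt attribute which is not the empty string.
const NON_EMPTY_ALT = 'img[alt]:not([alt=""])';

// Hypothetical helper: first non-empty alt inside the given scope.
function firstNonEmptyAlt($, scope = 'article') {
  const alt = $(`${scope} ${NON_EMPTY_ALT}`).first().attr('alt');
  // The selector still matches whitespace-only values, so trim before returning.
  return alt && alt.trim() ? alt.trim() : undefined;
}
```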
But what I recommend is to start with something more basic; we can improve the rules bundle over time.
Just creating a rules bundle that groups the alt versions of these rules sounds like a good start to me:
https://github.com/microlinkhq/metascraper/blob/master/packages/metascraper-image/index.js#L10
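A basic bundle along those lines might look like the sketch below. It follows the general shape of metascraper rules bundles (a function returning property names mapped to arrays of rules that receive `{ htmlDom, url }`), but the `imageAlt` property name, the `imageAltRules` function, and the exact selector list are assumptions for illustration, not an existing package.

```javascript
// Normalise a raw attribute value: trimmed string, or undefined when empty.
const toAlt = (value) => {
  const trimmed = typeof value === 'string' ? value.trim() : '';
  return trimmed || undefined;
};

// Build a rule that reads the `content` attribute of a meta tag.
const fromMeta = (selector) => ({ htmlDom: $ }) =>
  toAlt($(selector).attr('content'));

// Hypothetical rules bundle: a plain list of alt selectors, tried in order.
const imageAltRules = () => ({
  imageAlt: [
    fromMeta('meta[property="og:image:alt"]'),
    fromMeta('meta[name="twitter:image:alt"]'),
    fromMeta('meta[property="twitter:image:alt"]'),
  ],
});
```

Because it is just a flat list, more selectors can be appended later and each one checked against the integration fixtures.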
My feeling is that, in the end, you can't reliably correlate image and image-alt values at all; although the two are related, HTML markup is a jungle. Even an alt without an image is semantically valid, and the low presence of the selector tells me that we may not be able to be as strict as we'd like there.
Also, I recommend you start just with a plain list of alt selectors because we can test them against our integration tests:
https://github.com/microlinkhq/metascraper/tree/master/packages/metascraper/test/integration
so we can have a global vision of how these selectors are used.
The direction isn't clear yet, so I'm closing this for now; it can be revisited in the future.
Subject of the issue
For accessibility purposes, I want to be able to show the alt text of images returned by metascraper. It would be great to upgrade the image rules bundle to provide this (since pretty much any situation where you need an image needs alt text) -- or alternatively have a separate rules bundle if that's the only way to do it.