mozilla / page-metadata-service

DEPRECATED - A RESTful service that returns the metadata about a given URL.
Mozilla Public License 2.0

Ignore NSFW content from Tumblr #97

Closed pdehaan closed 2 years ago

pdehaan commented 8 years ago

It's only a drop in the bucket, but it looks like Tumblr at least returns an X-Tumblr-Content-Rating: nsfw header, which we can probably filter on after getting the fetch() response.

I found https://www.tumblr.com/docs/en/nsfw, but it doesn't seem to mention headers or other options.


UPDATE(s): Tumblr's not very good at creating documentation, but this also isn't really "end-user" kinda stuff.

Looks like we may also need to check for an X-Tumblr-Content-Rating: adult header. So far the header value appears to be either "nsfw" or "adult".

  1. http://stackoverflow.com/questions/17374062/how-is-tumblrs-x-tumblr-content-rating-defined (but is old).
  2. http://webapps.stackexchange.com/questions/59169/tumblrs-current-nsfw-policy-for-adult-blogs/59248#59248
  3. https://staff.tumblr.com/post/55906556378/all-weve-heard-from-a-bunch-of-you-who-are
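
A minimal sketch of the header check discussed above, not the service's actual code. The function and constant names are hypothetical, and the set of blocked values only covers the two seen so far ("nsfw" and "adult"); Tumblr may send others.

```javascript
// Values observed so far in this thread; the list may well be incomplete.
const BLOCKED_TUMBLR_RATINGS = new Set(['nsfw', 'adult']);

// `headers` is anything with a Headers-style get(), for example the
// `response.headers` object of a fetch() response.
// Returns true when the response carries one of the blocked ratings.
function isBlockedByTumblrRating(headers) {
  const rating = headers.get('x-tumblr-content-rating');
  if (rating === null || rating === undefined) {
    return false; // header absent: nothing to filter on
  }
  // Normalize case and whitespace before comparing.
  return BLOCKED_TUMBLR_RATINGS.has(rating.trim().toLowerCase());
}
```

After a fetch(), this could be called as isBlockedByTumblrRating(response.headers) before any metadata is parsed, so flagged pages are dropped early.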

pdehaan commented 8 years ago

Interesting, it also looks like there is a Rating header (albeit cryptic).

X-Tumblr-Content-Rating: adult
Rating: RTA-5042-1996-1400-1577-RTA

Closest I can figure out, it's this: http://www.rtalabel.org/index.php?content=howto, which also seems to have a <meta> tag version:

<meta name="RATING" content="RTA-5042-1996-1400-1577-RTA" />

http://www.rtalabel.org/index.php?content=howtofaq
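
A rough sketch of checking for the RTA label in either place, assuming the label string is exactly the one shown above. The helper names are hypothetical, and the regex-based <meta> scan is a shortcut rather than real HTML parsing.

```javascript
// Label string as it appears in the header and <meta> tag quoted above.
const RTA_LABEL = 'RTA-5042-1996-1400-1577-RTA';

// `headers` is anything with a Headers-style get(), e.g. `response.headers`.
function hasRtaHeader(headers) {
  const rating = headers.get('rating');
  if (rating === null || rating === undefined) {
    return false;
  }
  return rating.trim().toUpperCase() === RTA_LABEL;
}

// Scans raw HTML for <meta name="RATING" content="RTA-..."> in any
// attribute order. A regex scan is an assumption made for brevity;
// a real HTML parser would be more robust.
function hasRtaMetaTag(html) {
  const metaTags = html.match(/<meta\b[^>]*>/gi) || [];
  return metaTags.some((tag) =>
    /\bname\s*=\s*["']?rating["']?(?=[\s\/>])/i.test(tag) &&
    tag.toUpperCase().includes(RTA_LABEL)
  );
}
```

Either check returning true would be grounds to treat the page as adult content, whichever site it came from.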

pdehaan commented 8 years ago

Looks like certain subreddits may send an x-over18: true header to flag that they are adult/NSFW.
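
A small sketch of the same idea for the Reddit header, assuming the value is the literal string "true" as observed above. The function name is hypothetical, and it is not confirmed here which subreddits actually send the header.

```javascript
// `headers` is anything with a Headers-style get(), e.g. `response.headers`.
// Returns true only when x-over18 is present and equal to "true".
function isFlaggedOver18(headers) {
  const value = headers.get('x-over18');
  if (value === null || value === undefined) {
    return false;
  }
  return value.trim().toLowerCase() === 'true';
}
```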