iipc/webarchive-commons
Common web archive utility code.
Apache License 2.0
ExtractingParseObserver: extract rel, hreflang and type attributes #86
Closed. sebastian-nagel closed 4 years ago
sebastian-nagel commented 4 years ago
cf. commoncrawl/ia-web-commons#10
- add "rel" attribute to A and AREA links (used to define link types: alternate, canonical, etc.)
- add attributes "hreflang" and "type" (MIME type) to A@/href links
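
For illustration only, here is a minimal standalone sketch of the behaviour described above. It is not the ExtractingParseObserver code from this change: the class `LinkAttributeSketch` and its `extract` method are hypothetical names, and the regex-based parsing is a simplification (it assumes no `>` inside attribute values) that stands in for the real HTML parser. It copies "rel" for both A and AREA links, and "hreflang" and "type" for A links only.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of webarchive-commons. */
final class LinkAttributeSketch {

    // Start tag of an A or AREA element; group 1 = tag name, group 2 = raw attributes.
    // Simplification: assumes attribute values do not contain '>'.
    private static final Pattern TAG =
            Pattern.compile("<(a|area)\\b([^>]*)>", Pattern.CASE_INSENSITIVE);

    // name="value", name='value' or name=value
    private static final Pattern ATTR = Pattern.compile(
            "([a-zA-Z][a-zA-Z0-9_:-]*)\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s\"'>]+))");

    /** Returns href plus the link attributes of interest, or an empty map if there is no link. */
    static Map<String, String> extract(String startTag) {
        Map<String, String> out = new LinkedHashMap<>();
        Matcher tag = TAG.matcher(startTag);
        if (!tag.find()) {
            return out;
        }
        String tagName = tag.group(1).toLowerCase(Locale.ROOT);

        // Collect all attributes; names are lowercased, first occurrence wins.
        Map<String, String> attrs = new LinkedHashMap<>();
        Matcher m = ATTR.matcher(tag.group(2));
        while (m.find()) {
            String value = m.group(2) != null ? m.group(2)
                    : m.group(3) != null ? m.group(3) : m.group(4);
            attrs.putIfAbsent(m.group(1).toLowerCase(Locale.ROOT), value);
        }
        if (!attrs.containsKey("href")) {
            return out; // not a link
        }
        out.put("href", attrs.get("href"));

        // "rel" (link type: alternate, canonical, ...) applies to A and AREA.
        copyIfPresent(attrs, out, "rel");
        // "hreflang" and "type" (MIME type) are copied for A links only.
        if (tagName.equals("a")) {
            copyIfPresent(attrs, out, "hreflang");
            copyIfPresent(attrs, out, "type");
        }
        return out;
    }

    private static void copyIfPresent(Map<String, String> from, Map<String, String> to, String name) {
        if (from.containsKey(name)) {
            to.put(name, from.get(name));
        }
    }
}
```

Calling `LinkAttributeSketch.extract("<a href=\"/de/\" rel=\"alternate\" hreflang=\"de\" type=\"text/html\">")` yields a map with the keys href, rel, hreflang and type, while the same attributes on an AREA tag yield only href and rel.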