acl-org / acl-anthology

Data and software for building the ACL Anthology.
https://aclanthology.org
Apache License 2.0

noindex pages should not be in the sitemap #603

Open akoehn opened 4 years ago

akoehn commented 4 years ago

Several pages are marked as noindex but still appear in the sitemap (I have not looked into where they are marked).

For each of these pages we should decide whether they should be indexed and either remove the noindex or remove them from the sitemap.
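To find the affected pages, one option is to check each generated HTML file for a robots meta tag. The following is only a minimal sketch using the Python standard library. The names `RobotsMetaParser` and `is_noindex` are made up for illustration and are not part of the Anthology code base. It also assumes the pages are marked with a `<meta name="robots">` tag rather than, say, an HTTP header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives of every <meta name="robots" ...> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute values can be None (e.g. <meta name>), hence the "or".
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.extend(
                d.strip().lower() for d in content.split(",")
            )


def is_noindex(html_text):
    """Return True if the page carries a robots meta tag with 'noindex'."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return "noindex" in parser.directives
```

Running `is_noindex` over the HTML of every URL listed in the sitemap would give the list of pages that need a decision.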

mbollmann commented 4 years ago

The section pages are not directly linked from anywhere and mainly exist for technical reasons. They are marked as noindex here:

https://github.com/acl-org/acl-anthology/blob/6e7cf33233e41b36a507fac86ede9e3f9e99fdd4/hugo/layouts/_default/section.html#L2

akoehn commented 4 years ago

In that case, they should not be included in the sitemap that Hugo generates. I don't know the best way to do that (other than sed-ing them out of the resulting XML files).
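For reference, the post-processing route could look roughly like the sketch below, which parses the XML instead of using sed. This is an assumption-laden illustration, not something taken from the repository: `drop_urls` is a hypothetical helper, and the `/sections/` path in the usage note is a placeholder, since the actual URL pattern of the section pages is not stated here. It assumes the sitemap uses the standard sitemaps.org namespace.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def drop_urls(sitemap_xml, should_drop):
    """Return the sitemap XML without the <url> entries whose <loc>
    makes should_drop(loc) return True."""
    # Keep the default namespace unprefixed when serializing.
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(sitemap_xml)
    # Iterate over a copy, since entries are removed from root in the loop.
    for url in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = url.findtext(f"{{{SITEMAP_NS}}}loc", default="").strip()
        if should_drop(loc):
            root.remove(url)
    return ET.tostring(root, encoding="unicode")
```

A call such as `drop_urls(xml_text, lambda loc: "/sections/" in loc)` would then strip every entry whose URL contains the placeholder path. Excluding the pages when the sitemap is generated would be cleaner than filtering afterwards, if Hugo can be configured to do so.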