acl-org / acl-anthology

Data and software for building the ACL Anthology.
https://aclanthology.org
Apache License 2.0

Announcement of updates / RSS feed #358

Closed evanmiltenburg closed 5 months ago

evanmiltenburg commented 5 years ago

Updates to the Anthology used to be announced through this Google group, but it seems that's no longer the case.

Would it be possible to provide some kind of updates, either through mail or RSS? I'd imagine that RSS is relatively easy, since it could be generated along with any other updates of the anthology.

akoehn commented 5 years ago

RSS would be nice, but the repository would need some kind of NEWS file (or directory) to generate the feed from. There is currently no machine-readable list of relevant changes; the git commits cover all kinds of stuff and only some of them are relevant to a wider audience.

mjpost commented 5 years ago

Min turned the keys of that list over to me but I haven't been using it. I didn't expect that people actually followed it!

I would love to have this done automatically, and an RSS feed is a good idea. I have discussed the need to note publication dates (#319), so this could fit in with that. It could work by having a script that parsed all the XML, pulled these dates out, sorted them, and then generated a chronological RSS feed from them, as part of the hugo build.

evanmiltenburg commented 5 years ago

Ah yes, I've been using it to keep track of all proceedings, usually posting them to Twitter as well. So I thought the proceedings for many conferences were just delayed, until I saw that they were actually uploaded. (Relevant XKCD: https://xkcd.com/1172/)

Related: a Twitter bot for the proceedings would also be cool, but if the RSS is there then that should be easy for interested parties to build.

mjpost commented 5 years ago

I think I have an idea of how to do this, and plan to add an RSS feed, once I manage to get #317 done (currently on hold in PR #324).

mjpost commented 5 years ago

@mbollmann (or anyone) do you have any thoughts on how to do an RSS feed with Hugo? I know there are built-in tools for this, but they assume a blogging format that I don't think will apply. I have a feed.xml file and was planning to use it as a template, filling it with all volumes and papers which are tagged with an ingestion-date attribute. I welcome thoughts on this.

mjpost commented 5 years ago

FYI, for anyone interested in this: volumes (<volume> tags in the XML) now have an ingest-date attribute. It would be easy to loop over these and generate an RSS feed from the volumes that have this attribute, updating it each time the site is built via a root-level feed.xml Hugo template.
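A minimal sketch of that loop, in Python rather than as a Hugo template. Only the <volume> tag and the ingest-date attribute come from the comment above; the sample XML, the meta/booktitle path, and the function name `dated_volumes` are assumptions made for illustration, and the real Anthology schema may differ.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical sample data; the real Anthology XML may be structured differently.
SAMPLE = """<collection id="X19">
  <volume id="1" ingest-date="2019-07-28"><meta><booktitle>Volume A</booktitle></meta></volume>
  <volume id="2"><meta><booktitle>Volume B</booktitle></meta></volume>
  <volume id="3" ingest-date="2019-03-01"><meta><booktitle>Volume C</booktitle></meta></volume>
</collection>"""


def dated_volumes(xml_text):
    """Return (ingest_date, volume_id, title) tuples, newest first.

    Volumes without an ingest-date attribute are skipped.
    """
    root = ET.fromstring(xml_text)
    entries = []
    for volume in root.iter("volume"):
        raw = volume.get("ingest-date")
        if raw is None:
            continue  # older volumes may not carry a date
        title = volume.findtext("meta/booktitle", default="(untitled)")
        entries.append((date.fromisoformat(raw), volume.get("id"), title))
    # Sort on the date only, most recent first.
    return sorted(entries, key=lambda entry: entry[0], reverse=True)


if __name__ == "__main__":
    for ingest_date, volume_id, title in dated_volumes(SAMPLE):
        print(ingest_date.isoformat(), volume_id, title)
```

Skipping undated volumes means older material simply never appears in the feed, which seems acceptable for an announcement channel.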

mjpost commented 4 years ago

Just to re-up this: This should be a simple matter of writing a Python script that searches for all date-tagged volumes in the Anthology, and fills an XML template.
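For illustration, filling a template could look roughly like the sketch below. The template text, the channel title and description, the `render_feed` name, and the URL paths in the usage example are all hypothetical, not the Anthology's actual feed.xml. It assumes the dated entries have already been collected as (ISO date, URL path, title) tuples.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from string import Template
from xml.sax.saxutils import escape

# Hypothetical RSS 2.0 skeleton; the real feed.xml may look different.
FEED = Template("""<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
<channel>
<title>$title</title>
<link>$link</link>
<description>$description</description>
$items
</channel>
</rss>
""")

ITEM = Template(
    "<item><title>$title</title><link>$link</link>"
    "<pubDate>$pub_date</pubDate></item>"
)


def render_feed(entries, site="https://aclanthology.org"):
    """Render (iso_date, url_path, title) tuples as an RSS string, newest first."""
    items = []
    for iso_date, path, title in sorted(entries, reverse=True):
        # RSS expects RFC 822 style dates; midnight UTC is an assumption here.
        when = datetime.fromisoformat(iso_date).replace(tzinfo=timezone.utc)
        items.append(ITEM.substitute(
            title=escape(title),  # escape &, < and > in text content
            link=escape(f"{site}/{path}"),
            pub_date=format_datetime(when),
        ))
    return FEED.substitute(
        title="ACL Anthology updates",
        link=site,
        description="Newly ingested volumes",
        items="\n".join(items),
    )


if __name__ == "__main__":
    print(render_feed([
        ("2019-03-01", "volumes/X19-3/", "Volume C"),
        ("2019-07-28", "volumes/X19-1/", "Volume A"),
    ]))
```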

mbollmann commented 4 years ago

It might be sensible to integrate this with Hugo though, as this seems very much related to #722.

akoehn commented 4 years ago

This misses all corrections we make. The corrections are probably only of interest to dblp and the like, but maybe they want to be updated as well.

mjpost commented 4 years ago

Revisions now also require a date, so those could be added to the script.

It'd be great to have this be Hugo native, but I have much less of an idea how to go about that.

mbollmann commented 4 years ago

> It'd be great to have this be Hugo native, but I have much less of an idea how to go about that.

Basically by creating a Hugo page for every update (e.g., under /updates/) and writing a Hugo RSS template that produces the feed in the desired format.
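As a rough sketch of the first half of that idea, a build step could write one content page per update, which Hugo would then pick up like any other section. The directory layout, file naming, front matter fields, and the `write_update_pages` helper are assumptions for illustration; the Hugo RSS template itself is not shown.

```python
import json
from pathlib import Path


def write_update_pages(updates, content_dir):
    """Write one Markdown page per update under <content_dir>/updates/.

    `updates` is an iterable of (iso_date, slug, title) tuples.
    Returns the list of paths written.
    """
    out_dir = Path(content_dir) / "updates"
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for iso_date, slug, title in updates:
        page = out_dir / f"{iso_date}-{slug}.md"
        front_matter = "\n".join([
            "---",
            # json.dumps yields a double-quoted string that YAML also accepts
            f"title: {json.dumps(title)}",
            f"date: {iso_date}",
            "---",
            "",
        ])
        body = f"{title} was added to the Anthology.\n"
        page.write_text(front_matter + "\n" + body, encoding="utf-8")
        written.append(page)
    return written


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as tmp:
        for path in write_update_pages(
            [("2019-07-28", "x19-1", "Volume A")], tmp
        ):
            print(path.read_text(encoding="utf-8"))
```

Because each update is an ordinary page with a date, listing the latest N on the front page and mixing in hand-written posts would use the same mechanism.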

The advantage would be that users could easily browse the updates on the website, we could automatically show the latest N updates on the front page, and it could be combined with actual blog posts as the underlying mechanism will be exactly the same.

mjpost commented 4 years ago

This is perfect, a really nice idea.

mjpost commented 4 years ago

Related to this, it would be really nice to offer RSS feeds for authors. Generating an Atom XML file (e.g., like arxiv does) probably wouldn't be too difficult.

akoehn commented 4 years ago

@mjpost with corrections I meant something like fixing a name spelling in the XML, not a correction to a paper. If this is intended for ingestion by dblp or the like, they probably want to update the metadata they scraped from us.

CSchoel commented 1 year ago

Hey there. :wave: I just wanted to ask if there is still interest in this feature. If this is the case, I would like to help out and implement it. I don't want to give any promises yet, but I might be able to work on it next week.

mjpost commented 1 year ago

Yes, this would be great! It would help us consolidate some of our documentation, and would make it easier to announce things like new ingestions.

This is a very old issue, so much has changed underneath. I think the basic need at this point is:

CSchoel commented 1 year ago

Nice! Thanks for the swift reply and the update on what needs to happen. Are there any other guidelines or starting points I should be aware of other than the README_detailed.md and the Hugo documentation?

mbollmann commented 1 year ago

It might be worth keeping in mind that some details in the docs may be outdated, so if anything is unclear, just ask here. :)

CSchoel commented 1 year ago

Thanks again. :smiley: I started with writing a few issues for myself today, but didn't get to coding anything yet. There are some open questions in https://github.com/CSchoel/acl-anthology/issues/2 and https://github.com/CSchoel/acl-anthology/issues/3, though.

mbollmann commented 12 months ago

This feature is coming together now in #2744 (thanks @CSchoel!), with a preview being generated through #2859 here:

It implements both the RSS feed and #722.

@mjpost and others: I think this is ready for your comments. It might need some more tweaking (e.g. layout of news section, exact content of RSS feed, ... also, the RSS feed is currently invalid XML due to an unescaped &, hope to fix that soon). But it looks fully usable at this stage, and I'd love for some pairs of eyes other than my own to look at this. ;)

CSchoel commented 12 months ago

The issue with the unescaped & should be fixed once the preview is updated to the current branch version. See https://github.com/acl-org/acl-anthology/pull/2744#issuecomment-1793083514.
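For readers unfamiliar with this failure mode, here is a small Python illustration of why a bare & breaks a feed and how escaping repairs it. This is not the fix from the PR, only a generic example; `item_xml` and `is_well_formed` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def item_xml(title):
    """Build a minimal <item>, escaping &, < and > in the title text."""
    return f"<item><title>{escape(title)}</title></item>"


def is_well_formed(xml_text):
    """Return True if the string parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    raw = "<item><title>Questions & Answers</title></item>"
    print(is_well_formed(raw))                              # bare ampersand
    print(is_well_formed(item_xml("Questions & Answers")))  # escaped
```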

CSchoel commented 6 months ago

Hey @mbollmann @mjpost. :wave: It's been quiet for a while here. If there is anything that you need from me, please let me know. I don't have a lot of time on my hands, but I still want to help to get this done. :muscle: :smile:

mjpost commented 6 months ago

Hi @CSchoel, thanks for following up. This is my fault. I'll review this as soon as possible and get back to you. In the meantime if you want to merge in the latest master that'd be great.

mbollmann commented 6 months ago

@mjpost I merged in master in our duplicate of @CSchoel’s implementation #2859 where it also generates a preview.