Closed dokterbob closed 11 years ago
As discussed, it would be interesting to include in the feed the contents of the article that each entry points to. Sometimes this is the content of the page itself (http://www.rijksoverheid.nl/nieuws/2012/11/28/nieuwe-maatregel-in-strijd-tegen-kinderpornografie.html), sometimes it is the content of the PDF that is linked on that page (http://www.rijksoverheid.nl/documenten-en-publicaties/kamerstukken/2012/10/03/voortgang-aanpak-kinderpornografie.html).
The content should be like this:
Alternatively, (the first page of) the PDF should be embedded, or the contents of the PDF itself should be included, whichever is easier to implement.
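To decide between page content and PDF content, the crawler first needs to know whether the page links to a PDF at all. A minimal sketch of that check, using only the Python standard library, might look like the following. The names `PdfLinkFinder` and `find_pdf_links` are hypothetical and not part of Newspeak, and matching on a `.pdf` suffix is an assumption that will miss PDFs served from URLs without that extension.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PdfLinkFinder(HTMLParser):
    """Collect absolute URLs of <a> links whose path ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.pdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL.
        absolute = urljoin(self.base_url, href)
        # Assumption: a PDF is recognised purely by its file extension.
        if urlparse(absolute).path.lower().endswith(".pdf"):
            self.pdf_links.append(absolute)


def find_pdf_links(html, base_url):
    """Return all PDF links found in the given HTML, in document order."""
    finder = PdfLinkFinder(base_url)
    finder.feed(html)
    return finder.pdf_links
```

If the returned list is empty, the page itself would be treated as the article; otherwise the first PDF link is a candidate for embedding or text extraction.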
We are using Newspeak with a variety of clients, including:
The approach for this will be:
A similar approach will be taken when no textual content is supplied in the description/summary. Perhaps we need an optional switch that allows existing description/summary fields to be overridden? Alternatively, we could place the crawled text in an optional 'content' field for inline display in the feed reader.
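The two options above can be combined in one small function. This is only a sketch under stated assumptions: entries are modelled as plain dicts with `summary` and `content` keys, and `apply_crawled_text` and its `override` flag are hypothetical names, not Newspeak's actual model or settings.

```python
def apply_crawled_text(entry, crawled_text, override=False):
    """Return a copy of entry with crawled text merged in.

    Assumed policy (illustrative only):
    - an empty summary is always filled with the crawled text;
    - an existing summary is replaced only when override is True;
    - otherwise the crawled text goes into a separate 'content' field.
    """
    result = dict(entry)
    if not crawled_text:
        # Nothing was crawled, leave the entry untouched.
        return result

    existing = (result.get("summary") or "").strip()
    if override or not existing:
        result["summary"] = crawled_text
    else:
        result["content"] = crawled_text
    return result
```

Keeping the original summary by default means feeds that already provide a good description are not changed unless the switch is explicitly turned on.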
The functionality mentioned above is implemented and tested for PDF files in government documents. Output also works.
Many of the feeds' pages actually contain just a PDF file. The aim is to include these as an 'attachment' in the final feed so they can be automatically included in feed readers.
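In RSS 2.0, an attachment is expressed as an `<enclosure>` element with `url`, `length` and `type` attributes. A rough sketch of emitting one for a PDF with the standard library follows. The function name `build_item_with_enclosure` and the example URLs are hypothetical, and this is not necessarily how Newspeak generates its output.

```python
import xml.etree.ElementTree as ET


def build_item_with_enclosure(title, link, pdf_url, pdf_length):
    """Build an RSS 2.0 <item> element carrying a PDF as an enclosure."""
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    # RSS 2.0 requires url, length (size in bytes) and MIME type.
    ET.SubElement(
        item,
        "enclosure",
        {
            "url": pdf_url,
            "length": str(pdf_length),
            "type": "application/pdf",
        },
    )
    return item


# Usage with made-up values:
example = build_item_with_enclosure(
    "Example document",
    "http://example.org/doc.html",
    "http://example.org/doc.pdf",
    12345,
)
xml_text = ET.tostring(example, encoding="unicode")
```

The `length` value would normally come from the `Content-Length` header or from the size of the downloaded file.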
TODO