Closed mro closed 7 years ago
Looks like they did a comprehensive relaunch, grr.
Yeah, they rolled out a new website, the URL schemes aren't the same, and I don't see the latest available videos on each category page, just collections, playlists, series... That's neat for the consumer, but really unfortunate for us :( Moreover, while subcategories list all available videos in what seems to be descending order, there are way too many subcategories and they list barely more than name, URL, thumbnail and duration...
This is not just a loose screw: we're facing a complete bridge rewrite here, if it's even possible to retrieve the relevant data to make a feed. That's way beyond my skill :(
Indeed. They're not very keen on OpenData.
One approach may be to just take today +1 / -7 days and grab http://www.arte.tv/de/guide/ and http://www.arte.tv/guide/api/api/program/de/scheduled/17-05-02 respectively.
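To make the date window concrete, here is a minimal sketch in Python that builds one schedule URL per day from today -7 to today +1. The URL pattern is copied from the comment above; reading `17-05-02` as yy-mm-dd is my assumption based on that single example, and `schedule_urls` is just a name I made up.

```python
from datetime import date, timedelta

# URL pattern taken from the comment above. The yy-mm-dd date format is an
# assumption inferred from the single example "17-05-02".
SCHEDULE_URL = "http://www.arte.tv/guide/api/api/program/{lang}/scheduled/{day}"


def schedule_urls(today, lang="de", days_back=7, days_ahead=1):
    """Return one schedule URL per day from today - days_back to today + days_ahead."""
    urls = []
    for offset in range(-days_back, days_ahead + 1):
        day = today + timedelta(days=offset)
        urls.append(SCHEDULE_URL.format(lang=lang, day=day.strftime("%y-%m-%d")))
    return urls
```

Calling `schedule_urls(date.today())` gives nine URLs, oldest first. Whether the endpoint still answers for each of them is something I haven't checked.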
The website is just flat-out worse. If I search for a program, such as Karambolage, it tells me neither when it was broadcast nor the blurb. I strongly disagree that this is "neat for the consumer" as I can't easily discover any of Arte's awesome programming through their website anymore. There's nothing to be found but a bunch of stupid, useless thumbnails.
/rant
Well, visitors weren't stakeholders, I guess. I bet the term coffeetable was used a lot in the marketing meetings.
Just had a look :face_with_head_bandage:
They certainly don't go for simplicity... and of course no feeds, because that would be too easy.
They do provide quite an extensive list of categories and related shows on the main page, though without much detail: http://www.arte.tv/fr/
I extracted the JSON data and beautified it here
Maybe we can extract feeds from that?
Example:
{
"id" : "5927e2ed96403",
"kind" : "SHOW",
"programId" : "047867-001-A",
"language" : "fr",
"url" : "http:\u002F\u002Fwww.arte.tv\u002Ffr\u002Fvideos\u002F047867-001-A\u002Ffemme-de-viking-1-2",
"title" : "Femme de Viking (1\u002F2)",
"subtitle" : "La fuite de Sigrun",
"images" : [...],
"publicationBegin" : "2017-05-26T08:10:00Z",
"publicationEnd" : "2017-06-01T03:00:00Z",
"markings" : [],
"geoblocking" : null,
"creationDate" : "2017-05-26T08:10:21Z",
"lastModified" : "2017-05-26T08:30:52Z",
"stickers" : [],
"warning" : null,
"duration" : 52,
"childrenCount" : null
}
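As a rough sketch of how one of these entries could become a feed item (in Python, not actual bridge code): the input keys are the ones from the example above, while the output keys (`uri`, `title`, `timestamp`, `uid`) and the function name `entry_to_item` are my own choice for illustration.

```python
import json
from datetime import datetime, timezone


def entry_to_item(entry):
    """Map one entry of the main-page JSON to a minimal feed item (hypothetical layout)."""
    title = entry["title"]
    if entry.get("subtitle"):
        title = f'{title} - {entry["subtitle"]}'
    # publicationBegin is given in UTC ("Z" suffix) in the example above.
    published = datetime.strptime(
        entry["publicationBegin"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return {
        "uri": entry["url"],
        "title": title,
        "timestamp": int(published.timestamp()),
        "uid": entry["programId"],
    }


# Trimmed copy of the example entry; the \u002F escapes decode to "/".
SAMPLE = r'''{
  "programId": "047867-001-A",
  "url": "http:\u002F\u002Fwww.arte.tv\u002Ffr\u002Fvideos\u002F047867-001-A\u002Ffemme-de-viking-1-2",
  "title": "Femme de Viking (1\u002F2)",
  "subtitle": "La fuite de Sigrun",
  "publicationBegin": "2017-05-26T08:10:00Z"
}'''

item = entry_to_item(json.loads(SAMPLE))
```

The JSON has no blurb, so the item content would have to come from somewhere else or stay empty.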
> If I search for a program, such as Karambolage, it tells me neither when it was broadcast nor the blurb.
http://sites.arte.tv/karambolage/fr/voir-et-revoir-les-emissions-karambolage
Not sure how I got there but they freaking use separate domains for certain programs :confused:
It's better in that it at least displays the date, although without the title and blurb/subtitle it's still meh. I would like to give them a shout-out for using proper pages instead of endlessly scrolling ones like on npo.nl, which makes it completely impossible to find a broadcast from five or even one year(s) ago. (The only exception there is when the series has a dedicated website with an overview, such as here.) That reminds me that I wanted to look into writing an NPO bridge. :-P
Anyway, it would certainly help if I could somehow manage to end up on the kind of page you gave me without bookmarking it. :rofl:
PS For my use case making a feed out of those kinds of pages (however you may find them) would be more useful than a feed that gives you everything.
@Frenzie what's your use case?
mine is using the feed as a news feed so I can skim the headlines of past programs and grab them via youtube-dl
or MediathekView in case. High volume isn't much of an issue in my case. A category-blacklist would be nice, however.
So I'm for everything but within a time window.
@mro Basically the same except I'd never heard of MediathekView and volume's definitely an issue for me. I might glance at the website occasionally to see if there's anything of interest but I don't want to see everything all the time except for a few very specific programs (i.e., Karambolage, Le Dessous des cartes and I guess I'd consider keeping abreast of new releases of that NDR show Xenius even if I have no interest in watching them all; German Arte also had an interesting series on rivers around the world).
tl;dr I never used the existing ArteBridge functionality because it didn't serve my needs. But it didn't bother me enough to implement it myself. Just putting it out there if someone decides to rewrite it from scratch. ;-)
PS A program like MediathekView sounds somewhat redundant for, e.g., BR where all the programs readily provide download links.
PPS It's a pity that the JSON above doesn't indicate that the French version of that Femme de Viking program is dubbed from the original Die Frauen der Wikinger without loading yet another JSON.
You'll never guess where this leads to... spoiler :rofl:
They should link here instead. Actually this is more accurate
Seriously now, we can make something out of the pages. As long as the data is available (and they don't guard against bots) anything is possible. Also the API request mentioned by @mro is very interesting as it doesn't require registration and uses less bandwidth.
> An approach may be, to just take today +1 / -7 days and grab http://www.arte.tv/de/guide/ and http://www.arte.tv/guide/api/api/program/de/scheduled/17-05-02 respectively.
How did you figure that out? :open_mouth:
Although that link is broken, a quick search (using a search engine to search on their site…) shows that they do in fact have newsfeeds, but (I think) only in French and German. However, in those languages they don't actually link to anything from the main page, just the social media!
http://www.arte.tv/sites/services/flux-rss/ http://www.arte.tv/sites/de/services/rss-feeds/
Not all of them are functional, but the basic +7 seems to be.
http://www.arte.tv/papi/tvguide-flow/feeds/videos/fr.xml?type=ARTE_PLUS_SEVEN&player=true http://www.arte.tv/papi/tvguide-flow/feeds/videos/de.xml?type=ARTE_PLUS_SEVEN&player=true
[Edit: when you remove the &player=true it also gives you download links for all the video files pertaining to the particular language you're looking at.]
[Edit 2: when you remove all arguments you get an enormous list of programs going back to 2015 rather than just a few months; probably best not to do that wrt drawing attention. :-P]
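For reference, a small Python sketch that assembles these feed URLs, with `player` as an optional switch. The base URL and the `type` / `player` arguments are the ones from the links above; the helper name `plus_seven_feed_url` is made up, and I'm only assuming other language codes follow the same pattern.

```python
from urllib.parse import urlencode

# Base URL copied from the links above; only "fr" and "de" were seen working.
FEED_BASE = "http://www.arte.tv/papi/tvguide-flow/feeds/videos/{lang}.xml"


def plus_seven_feed_url(lang="fr", player=True):
    """Build the +7 feed URL; player=False drops &player=true (see the first edit above)."""
    params = {"type": "ARTE_PLUS_SEVEN"}
    if player:
        params["player"] = "true"
    return FEED_BASE.format(lang=lang) + "?" + urlencode(params)
```

`plus_seven_feed_url("de", player=False)` yields the variant that, per the first edit above, also includes the download links.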
Edit: also, they seem to have a secret API?
Quoting from https://www.drupal.org/project/arte_opa
The documentation of this API is available here : https://api.arte.tv/api/oauth/user/documentation
Configuration After requesting access keys from ARTE, you can configure OPA access settings at /admin/config/services/opa/config.
It doesn't specify how one would go about requesting access keys.
The API command to get the list of the videos is probably https://api.arte.tv/api/opa/v3/videos?sort=broadcastBegin&limit=10 The params you can give to the API are :
In order to access the API, you just need to add the header
Authorization: Bearer Nzc1Yjc1ZjJkYjk1NWFhN2I2MWEwMmRlMzAzNjI5NmU3NWU3ODg4ODJjOWMxNTMxYzEzZGRjYjg2ZGE4MmIwOA
I have also seen the token Bearer MWZmZjk5NjE1ODgxM2E0MTI2NzY4MzQ5MTZkOWVkYTA1M2U4YjM3NDM2MjEwMDllODRhMjIzZjQwNjBiNGYxYw
but I think it is used only to access menu elements generation (internally called EmacEndpoint API)
For filtering categories, you need to add &category.code=
plus the category (same for subcategories).
It is possible to chain categories by separating them with commas.
Category IDs :
To get the subcategories, it is possible to fetch https://api.arte.tv/api/opa/v3/subcategories?category.code= + category code
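Putting the pieces above together, a request could be prepared roughly like this in Python. This is only a sketch under the assumptions stated in this comment: the endpoint, the `sort` / `limit` / `category.code` parameters and the `Authorization: Bearer` header come from the text above, while `build_videos_request` is a made-up name and the category codes in the usage note are placeholders, not real codes.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint as quoted above ("probably" the right one, per the comment).
OPA_VIDEOS = "https://api.arte.tv/api/opa/v3/videos"


def build_videos_request(token, categories=(), limit=10):
    """Prepare (but do not send) a videos request with the bearer token header."""
    params = {"sort": "broadcastBegin", "limit": limit}
    if categories:
        # Several categories are chained with commas, as described above.
        params["category.code"] = ",".join(categories)
    # safe="," keeps the commas literal instead of encoding them as %2C.
    url = OPA_VIDEOS + "?" + urlencode(params, safe=",")
    return Request(url, headers={"Authorization": f"Bearer {token}"})
```

For example, `build_videos_request(token, categories=["AAA", "BBB"])` gives a request object that can be passed to `urllib.request.urlopen`. I haven't verified how the server responds to it.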
I don't really use this bridge, so I think it would be better to let someone who actually uses it implement the function you need. If you need any info on the APIs, just ask.
Should be fixed by cba65d6d087f14b2ca2fda995745ac8f5f79310d.
similar to #244 ? Rev c375ddd6ab5ec7a6
Can somebody confirm?