Closed hikavdh closed 8 years ago
For VRT you just need to have "application/vnd.epg.vrt.be.schedule_3.1+json" in the Accept header and it delivers the data as JSON: http://services.vrt.be/epg/schedules/thisweek?channel_code=O9&type=week
I'll look at it, but now I have to find some sleep. Tomorrow I have to resolve some Windows profile issues and create redundancy for next time.
I've added a no_genric_matching table to sourcematching. You can add IDs per source. I still have to implement the code. For those source/ID combinations I will disable everything but time/title matching, so also group-slots and split-episodes. If you haven't noticed yet: humo.be is down! I took the opportunity to add code to make configure fail in such a case. You can force it to complete by disabling the source.
Oh, and I noticed that the timings for Nickelodeon are also off on Horizon. In the past Horizon had a bad name for accuracy, etc., but I thought that had improved. By contrast, the three Belgian sources seem to be very much in consensus.
https://github.com/tvgrabbers/tvgrabnlpy/releases/tag/alfa-2.2.8-p20151222 This should disable genre and split episode matching for the source/channel combinations in the no_genric_matching table. CC on Horizon is already in that table. Groupslot detection disabling is more tricky, so I leave that for now.
I will add that the IDs in no_genric_matching will be put at the end of the source list for that channel. This means that you cannot set such a source as prime_source (unless of course it's the only source). Let me know if this works or whether groupslots also need to be disabled. As said, that's tricky. In the merge procedure, those are taken out of the listings and later put through a separate comparison against the remainder of the other listings. I'm afraid of unexpected side effects, so I'd rather leave that out.
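A rough sketch of that ordering rule, not the actual grabber code. It assumes the table maps a source number to a list of IDs and that a channel is a simple source-to-ID dict; the function name and the sample IDs are made up for illustration.

```python
def order_sources(chan_sources, source_order, no_generic):
    """Return the sources of one channel, with excluded ones moved to the end.

    chan_sources: {source number: id of this channel on that source}
    source_order: preferred order of source numbers
    no_generic:   {source number: [ids]} (assumed layout of no_genric_matching)
    """
    def excluded(source):
        return chan_sources[source] in no_generic.get(source, [])

    present = [s for s in source_order if s in chan_sources]
    # Stable partition: normal sources keep their order, excluded ones follow.
    return ([s for s in present if not excluded(s)] +
            [s for s in present if excluded(s)])


# Hypothetical channel: source 5 is listed in the table, so it ends up last
# and can only become prime source when it is the only source.
ordered = order_sources({0: "x", 1: "y", 5: "z"}, [5, 0, 1], {5: ["z"]})
print(ordered)
```

The first element of the result would then be the candidate for prime_source.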
I've also been thinking about Nickelodeon. To merge them back together without changing chanids/xmltvids I have to introduce chanid aliases. I can create a table for those, but I'd rather make it more general: a user-settable option to set an alias for a channel to use as xmltvid. Next to that we can create a table of chanids being phased out, to automatically create an alias on running --configure when that chanid is found active. That way aliases stay in use for a user until he/she changes his/her configuration, even if we remove a chanid from that table after a certain time. I'm thinking of adding an end date to the table, so we can then remove it. Otherwise it will get too crowded, as is starting to happen with the empty_channel list. There are already more than 10 no-longer-existing IDs in there.
Oh, and if you think other source/channel combinations would benefit, feel free to add them!
https://github.com/tvgrabbers/tvgrabnlpy/releases/tag/alfa-2.2.8-p20151223 Added the exclusion from prime_source and the new xmltvid_alias option. I have to work on configure.
https://github.com/tvgrabbers/tvgrabnlpy/releases/tag/alfa-2.2.8-p20151227
Added another option, legacy_xmltvids, further clearing the way for pushing xmltvid_aliases through source_matching. I documented both options in the wiki. I have to add the code for that, but I think that before we can re-merge the Nickelodeon channels, we have to wait at least a week after releasing the version. In the meantime I set the prime_source to 1.
You got any testing done? Or did the holidays come in between? ;-)
I haven't had time for any in depth testing
I did spot this one error from tvgids.tv, not sure if it's a parsing problem for this specific program, but it has popped up on several grabs in the past few days:
Error extracting ElementTree from:http://www.tvgids.tv/tv/trips-travel/14792796 on tvgids.tv
I guessed as much ;-) Those errors come from errors in their HTML encoding. Most common are unencoded double quotes in titles (notoriously in classical music titles on Brava) or tags partially placed inside tags. tvgids.tv is very sloppy. I catch some and now and then think about further algorithms to catch more, but it doesn't have the highest priority and is quite complex.
In raw-output you'll find the offending text and the exact location within.
https://github.com/tvgrabbers/tvgrabnlpy/releases/tag/alfa-2.2.8-p20151230
It took me a lot of thinking, but I have made a framework for remerging chanids. See the "merge_into" table in sourcematching.
"1-nickelodeon":{"chanid":"0-89", "sources":{"0":"89"}, "date":"20160101"}}
This results in the following actions on running configure:
At present we cannot yet set source 0 as prime_source for Nickelodeon, as the file is also used by older versions, so 1 has to do. If configure is not run and the merging chanid "0-89" is found, combined_channels is checked and updated ad hoc, and the ids from 0-89 are added to 1-nickelodeon. I also added a date field. This is not used, but gives us a clue on when to permanently update sourcematching.json. I think after 2 or 3 months.
I added an extra message field to explain. It's not yet in the above version.
I'm getting the following error after the latest changes:
An unexpected error has occured:
Traceback (most recent call last):
  File "tv_grab_nl.py", line 13101, in main
    x = config.validate_commandline()
  File "tv_grab_nl.py", line 2244, in validate_commandline
    x = self.get_sourcematching_file(self.args.configure)
  File "tv_grab_nl.py", line 2175, in get_sourcematching_file
    self.channels(newch).source_id[int(source)] = id
TypeError: 'dict' object is not callable
Oops, that one I missed. I had used '()' instead of '[]' and had copied that part over several times. I thought I had corrected them all. You checked without running --configure ;-) Download again in a minute.
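For anyone hitting the same message: a stripped-down illustration of that mistake, assuming channels is a plain dict keyed by chanid. The Config class here is a simplified stand-in, not the real one from tv_grab_nl.py.

```python
class Config(object):
    def __init__(self):
        # chanid -> {source number: source id}; a dict is subscripted, not called.
        self.channels = {"1-nickelodeon": {}}


config = Config()

try:
    config.channels("1-nickelodeon")        # wrong: '()' tries to call the dict
except TypeError as err:
    message = str(err)

config.channels["1-nickelodeon"][0] = "89"  # right: '[]' looks up the key
print(message)
```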
Updated the tag again. Found some more. I only tested with --configure! ;-(
I did look again at vrt.be, but what Accept header do you mean? http://services.vrt.be/epg/schedules/thisweek?channel_code=O9&type=week just gives a list of available formats.
Or better said, I guess I can find how to do it in Python, but can I do it in an ordinary browser, say Firefox? Else it becomes cumbersome to write the code as I have to get the output through Python always.
I added the vrt.be channels to source channels. Can you check whether the merges are right? I'm especially wondering about ketnet/één+/canvas+, as they are together in combined channels, and about radio 2 and possibly Klara. I'll post an alfa when I have the get channels part ready.
I'm not sure how to set custom headers with Python, but with curl and wget you do something like this (the URL has to be quoted, otherwise the shell treats the & as "run in background"):
curl "http://services.vrt.be/epg/schedules/thisweek?channel_code=O9&type=week" -H "Accept: application/vnd.epg.vrt.be.schedule_3.1+json"
wget "http://services.vrt.be/epg/schedules/thisweek?channel_code=O9&type=week" --header="Accept: application/vnd.epg.vrt.be.schedule_3.1+json"
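For reference, a minimal Python 3 sketch of the same request using only the standard library. The helper name and the user agent string are placeholders; the grabber itself may build its requests differently.

```python
import urllib.request

VRT_ACCEPT = "application/vnd.epg.vrt.be.schedule_3.1+json"


def build_vrt_request(url, user_agent="Mozilla/5.0"):
    """Build a request carrying the VRT Accept header next to the user agent."""
    headers = {"User-Agent": user_agent, "Accept": VRT_ACCEPT}
    return urllib.request.Request(url, headers=headers)


req = build_vrt_request(
    "http://services.vrt.be/epg/schedules/thisweek?channel_code=O9&type=week")

# Fetching needs network access, so it is left commented out here:
#   import json
#   with urllib.request.urlopen(req) as resp:
#       data = json.loads(resp.read().decode("utf-8"))
```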
https://github.com/tvgrabbers/tvgrabnlpy/releases/tag/alfa-2.2.8-p20160101 In Python I just add it to the dict that also contains the user agent, but it would be nice if I could just call it in Firefox instead of calling it on the command line, piping to a file and then opening the file. The above alfa contains get_channels for vrt.be. I still have to work on the listings. For now it just ignores the vrt.be ids on grabbing.
Oh, and I tagged a beta with all previous updates!
The following entries from VRT are inactive, you won't get any data from them, they are just historical entries for previous channels that either merged with another channel, rebranded or no longer exist:
04 Ketnet Alternatief
05 De Overname
14 Jazz Middelheim
15 Radio 1 Classics
30 Radio 2 De Topcollectie XL
33 Klara Jazz
42 Studio Brussel Rock It!
51 Donna (Now 55 MNM)
52 Donna Hitbits (Now 56 MNM Hits)
61 Radio Vlaanderen Internationaal
62 Radio Vlaanderen
N7 één+ (Now part of a combined listing with O9 Ketnet)
O7 Ketnet+&Canvas+ (Now part of a combined listing with O9 Ketnet)
The 1-ketnet-canvas-2 listing on tvgids.tv is a leftover from when there was a Ketnet+ sharing with Canvas+. It shows the same listings as 1-ketnet-op12, except it only goes out for 1 or 2 days, while 1-ketnet-op12 has more days. 1-ketnet-op12 is currently listed in empty_channels, so we should probably list 1-ketnet-canvas-2 as empty instead, since all it does is give you 1 day's worth of listings for Ketnet/één+/Canvas+.
I would keep VRT's Radio 2 sources separate from VPRO's Radio 2. VPRO just has a generic "Radio 2 Regionaal" program during local hours, while VRT includes details for those local shows. VRT's website defaults to Vlaams-Brabant (22) for Radio 2, so if you want to merge them, I would select that. Just make sure to take the name from VRT so people using it know that's the regional entry for Vlaams-Brabant, and either add "Radio 2 Regionaal" to groupslot_names so it doesn't override the local shows or make VRT the prime_source.
Depending on how much details you are able to get from their API, you might want to make it a prime_source for all of VRT's stations.
This is how the sources break down:
Eén: 0-5 1-een 5-24443943058 6-22 7-een 8-een 9-een 10-O8
Canvas: 0-6 1-ketnet-canvas 5-555680807173 6-18 7-vrt_canvas 8-canvas 9-canvas 10-1H
Ketnet/één+/Canvas+: 1-ketnet-op12 (needs to be removed from empty_channels) 5-24443943087 6-59 7-ketnet 10-O9
Ketnet Only: 8-ketnet 9-ketnet
één+ Only: 8-eenplus
VRT Radio 1 (VRT just calls it Radio 1, so you might want to rename it to VRT Radio 1 so people know it's for VRT): 7-vrt_radio_1 10-11
Klara: 7-klara 10-31
Also if you can detect the active/inactive state from the json, I would use that to determine which channels to include
{
  "code": "1H",
  "name": "Canvas",
  "displayName": "Canvas",
  "eid": "46162538",
  "type": "tv",
  "state": "active",
  "description": "",
  "radioplayerUrl": null,
  "websiteUrl": "http://www.canvas.be/",
  "logoUrl": "http://images.vrt.be/height100/logo/canvas/CANVAS_logo_lichtblauw.jpg",
  "streamsLink":
  {
    "rel": "http://services.vrt.be/channel/rel/channel/streams",
    "href": "http://services.vrt.be/channel/s/1H/streams"
  },
  "imagesLink":
  {
    "rel": "http://services.vrt.be/rel/images",
    "href": "http://services.vrt.be/channel/s/1H/images"
  },
  "thirdpartyLinksLink":
  {
    "rel": "http://services.vrt.be/rel/thirdpartylinks",
    "href": "http://services.vrt.be/channel/s/1H/thirdpartylinks"
  },
  "detailLink":
  {
    "rel": "http://services.vrt.be/channel/rel/channel",
    "href": "http://services.vrt.be/channel/s/1H"
  }
}
vs
{
  "code": "05",
  "name": "De Overname",
  "displayName": "De Overname",
  "eid": "05",
  "type": "radio",
  "state": "inactive",
  "description": "",
  "radioplayerUrl": null,
  "websiteUrl": "http://www.deovername.be/",
  "logoUrl": "http://services.vrt.be/images/height100/logos/vrt_grey.png",
  "streamsLink":
  {
    "rel": "http://services.vrt.be/channel/rel/channel/streams",
    "href": "http://services.vrt.be/channel/s/05/streams"
  },
  "imagesLink":
  {
    "rel": "http://services.vrt.be/rel/images",
    "href": "http://services.vrt.be/channel/s/05/images"
  },
  "thirdpartyLinksLink":
  {
    "rel": "http://services.vrt.be/rel/thirdpartylinks",
    "href": "http://services.vrt.be/channel/s/05/thirdpartylinks"
  },
  "detailLink":
  {
    "rel": "http://services.vrt.be/channel/rel/channel",
    "href": "http://services.vrt.be/channel/s/05"
  }
}
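A small sketch of filtering on that field, assuming the channel list has already been decoded into a list of dicts shaped like the two entries above (only a few of their keys are repeated here). The function name is made up.

```python
def active_channels(channels):
    """Return only the channel entries whose "state" is "active"."""
    return [ch for ch in channels if ch.get("state") == "active"]


sample = [
    {"code": "1H", "name": "Canvas", "type": "tv", "state": "active"},
    {"code": "05", "name": "De Overname", "type": "radio", "state": "inactive"},
]

codes = [ch["code"] for ch in active_channels(sample)]
print(codes)  # ['1H']
```

With a filter like this the inactive codes would not need to be listed in empty_channels.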
I saw the inactive tag, but thought it just meant 'no programming at present'. I'll add an exclusion on that tag, so we don't need to put them in empty_channels.
If we already set prime_source to 10 now, pre-2.2.8 users will fall back to prime_source_order for determining the prime_source. It might even create errors. I think I added an ignore for non-existent sources, but I have to check.
If we switch 1-ketnet-canvas-2 for 1-ketnet-op12 we have to keep 1-ketnet-canvas-2 as chanid, so the xmltvid does not change. Feel free to do that, so updating source_channels[1]["1-ketnet-canvas-2"] and empty_channels.
The naming of Radio 1/Radio 2 is clear from the grouping in Radio Vlaams, so I think there is no need to rename. But if you think renaming is better, again feel free to add those entries.
I'm wondering: they give start and end times to the second (GMT)
"startTime":"2015-12-28T05:00:08.000Z",
"endTime":"2015-12-28T05:05:15.000Z",
and the next starttime:
"startTime":"2015-12-28T05:05:23.000Z"
And that while they notoriously often start too late or too early, sometimes by more than 15 min. I always schedule them with broader margins.
Which channel is that for?
Ketnet
Is there another example of something in the future so I can compare with others?
In your example URL you used type=week; this seems to always mean Monday to Sunday of the running week. The alternative is type=day. I also see an option view=week or month, an option cascading (I do not know what it does) and an option date. Did you experiment with how to get data past the running week, and with the syntax for date?
For today I don't see the seconds, maybe they update afterwards?
"startTime":"2016-01-02T05:00:00.000Z"
"endTime":"2016-01-02T05:05:00.000Z"
"title":"Hopla"
"startTime":"2016-01-02T05:05:00.000Z"
But this should be accurate as it's the starting show of the day.
I didn't really experiment beyond finding the list of channels and the corresponding epg data.
I think you can use this to specify the specific days you want: http://services.vrt.be/epg/schedules/20160103?type=day&channel_code=O9
It might be easier to just keep on advancing until there's no more listings available.
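Roughly what I mean, as a sketch. The URL pattern is the one from the link above; the fetch callable, the max_days cap and the stub used in the example are assumptions, since I haven't checked what the service returns for a day without listings.

```python
from datetime import date, timedelta

BASE = "http://services.vrt.be/epg/schedules/{day}?type=day&channel_code={code}"


def day_url(day, channel_code):
    """Build the per-day schedule URL, e.g. .../schedules/20160103?type=day&..."""
    return BASE.format(day=day.strftime("%Y%m%d"), code=channel_code)


def grab_until_empty(first_day, channel_code, fetch, max_days=14):
    """Advance one day at a time until fetch() returns no programs.

    fetch is a hypothetical callable: it takes a URL and returns a list of
    programs, empty when there are no listings for that day.
    """
    listings = {}
    for offset in range(max_days):
        day = first_day + timedelta(days=offset)
        programs = fetch(day_url(day, channel_code))
        if not programs:
            break
        listings[day] = programs
    return listings


def stub_fetch(url):
    # Stand-in for the real download: pretend only two days have listings.
    return ["Hopla"] if "20160103" in url or "20160104" in url else []


result = grab_until_empty(date(2016, 1, 3), "O9", stub_fetch)
print(sorted(result))
```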
However later today:
"startTime":"2015-12-28T15:16:51.000Z"
"endTime":"2015-12-28T15:42:46.000Z"
"title":"Mega Mindy"
and the next starttime
"startTime":"2015-12-28T15:45:06.000Z"
My current listing has that one from 16:25 to 16:50 CET
I wonder if it's related to how they also record pretty much every episode internally and post some of them to their kijken player and the gap is ads/filler. If so it might be safe to extend to the start of the next program. I don't have access to a stream of Ketnet, maybe you can do some checking later to see what's closer to what actually aired.
No, I'll add the gap, like in the original NPO list, as 'ads/announcements' or 'Programmainfo en Reclame'.
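A minimal sketch of that gap filling, assuming each program is a dict with "start", "end" and "title" and the list is sorted by start time. The layout and function name are illustrative only; the title and end time of the second program in the example are made up.

```python
from datetime import datetime

FILLER_TITLE = "Programmainfo en Reclame"


def fill_gaps(programs):
    """Insert a filler entry wherever a program ends before the next one starts."""
    filled = []
    for current, following in zip(programs, programs[1:]):
        filled.append(current)
        if following["start"] > current["end"]:
            filled.append({"start": current["end"],
                           "end": following["start"],
                           "title": FILLER_TITLE})
    if programs:
        filled.append(programs[-1])  # the last program has no follower to compare
    return filled


programs = [
    {"start": datetime(2015, 12, 28, 15, 16, 51),
     "end": datetime(2015, 12, 28, 15, 42, 46),
     "title": "Mega Mindy"},
    {"start": datetime(2015, 12, 28, 15, 45, 6),
     "end": datetime(2015, 12, 28, 16, 0, 0),   # hypothetical end time
     "title": "Next show"},                     # hypothetical title
]

filled = fill_gaps(programs)
print([p["title"] for p in filled])
```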
Does Ketnet have ads like Zapp between shows, or is the filler just promos and hosted segments like CBeebies and CBBC?
I never watch it, but één and canvas do have leading and closing ads like 'this is brought to you by....'
I'll round starttime down to the nearest minute and endtime up to the nearest minute. I think seconds will get ignored by most.
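A sketch of that rounding on the timestamp format shown above (Python 3). The helper names are made up; the timestamps are the Mega Mindy ones from earlier in the thread.

```python
from datetime import datetime, timedelta, timezone


def parse_vrt_time(value):
    """Parse a timestamp such as "2015-12-28T15:16:51.000Z" as UTC."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


def floor_minute(moment):
    """Round down to the whole minute (for start times)."""
    return moment.replace(second=0, microsecond=0)


def ceil_minute(moment):
    """Round up to the whole minute (for end times); exact minutes stay as-is."""
    floored = floor_minute(moment)
    return floored if floored == moment else floored + timedelta(minutes=1)


start = floor_minute(parse_vrt_time("2015-12-28T15:16:51.000Z"))
end = ceil_minute(parse_vrt_time("2015-12-28T15:42:46.000Z"))
print(start.strftime("%H:%M"), end.strftime("%H:%M"))  # 15:16 15:43
```

Rounding outwards like this can make an end time overlap the next start by up to a minute, so the merge step would have to tolerate that.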
Also, I pushed the changes, along with a second one to add the empty_channels for VRT. VRT seems to have better logos for some things, is there an easy way to include that?
Like the RADIO2_RED_RGB.png is better than the one we currently have for 7-vrt_radio_2
Thanks! They are. Just run configure.
I mean specify so it will use that logo instead of the current one. I already ran configure and this is what the string says:
VRT Radio 2;12;7-vrt_radio_2;;;;;;;;vrt_radio_2;;;;4;radio_vrt2.png
Is there a way to change sourcematching.json so it uses the same logo as the other Radio 2 entries by default?
11;radio2/RADIO2_RED_RGB.png
Also, I think you were looking at the wrong day, for the next airing of Mega Mindy on 1/2 I see the following:
"startTime": "2016-01-02T15:35:00.000Z"
"endTime": "2016-01-02T16:00:00.000Z"
"title": "Mega Mindy"
The show after that says this:
"startTime": "2016-01-02T16:00:00.000Z"
"endTime": "2016-01-02T16:17:00.000Z"
"title": "Welkom in de Wilton"
It seems the precise to the second timeslots are only for shows that aired earlier
For Radio 2, as it does not have a vrt id, you have to add it to logo_names. The icon source is already there so it will work for 2.2.7 users too. Change entry:
"7-vrt_radio_2": ["4", "radio_vrt2"],
to
"7-vrt_radio_2": ["11", "radio2/RADIO2_RED_RGB.png"],
You're right, I got lost in that long listing. But still, it is 10 minutes later than in my present listing, grabbed last night! So hopefully they are more accurate!
For some reason it's not accepting the change, after I run configure it doesn't have a logo at all:
VRT Radio 2;12;7-vrt_radio_2;;;;;;;;vrt_radio_2;;;;-1;
I also tried adding VRT's een+ logo to 8-eenplus since the one from Nieuwsblad is the een logo, but it still is showing the one from nieuwsblad:
een+;2;8-eenplus;;;;;;;;;eenplus;;;8;eenplus.png
n/m on the eenplus one, since it's an old entry it's on a different url
A more properly named issue to continue #49 Error with some ids