raptor2101 / Mediathek

GNU General Public License v3.0

ARD Sendungsliste (0-9,A-Z) and Subfolder stopped working #70

Closed marksml closed 8 years ago

marksml commented 8 years ago

As of Tuesday, 23.02.2016, the plugin stopped working properly for the ARD Mediathek. It seems they changed the structure of the links: bcastId now comes first, followed by documentId. I changed that, but there seem to be more issues. It would most likely be faster if someone who knows the plugin's code looked at it and at the new webpage design.

Would be great if it can be fixed in time.

Thanks & Cheers, Michael

raptor2101 commented 8 years ago

there you go ... weekend ;)

marksml commented 8 years ago

I am looking into it already. If I have something running before you find time to look into it I will send you a pull request.

Cheers from Greifensee, CH

raptor2101 commented 8 years ago

Go to line 68 and replace it with this:

    self.regex_VideoPageLink = re.compile("<a href=\".*Video\?bcastId=\d+&amp;documentId=(\d+)\" class=\"textLink\">\s+?<p class=\"dachzeile\">(.*?)</p>\s+?<h4 class=\"headline\">(.*?)</h4>")
    self.regex_CategoryPageLink = re.compile("<a href=\"(.*Sendung\?bcastId=\d+&amp;documentId=\d+)\" class=\"textLink\">\s+?<p class=\"dachzeile\">.*?</p>\s+?<h4 class=\"headline\">(.*?)</h4>\s+?<p class=\"subtitle\">(.*?)</p>")

works for me

marksml commented 8 years ago

Thanks for the hint. I had reversed "bcastId=\d+&documentId=(\d+)" but had kept the (\d+) with bcastId. It now works for, say, the "Highlights" section, but if I open one of the letters it doesn't find any links and consequently shows nothing. It might be a local bug in my code. I will get a fresh copy and start over. Thanks & Regards
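The capture-group slip described above can be illustrated with a minimal sketch. The href is a made-up sample, not taken from the ARD site, and both patterns are cut down to the query string only.

```python
import re

# Hypothetical href in the new parameter order (bcastId first, then documentId).
href = "/tv/Beispiel/Video?bcastId=1234&amp;documentId=5678"

# Group left on bcastId: the pattern still matches, but captures the wrong ID.
wrong = re.compile(r"Video\?bcastId=(\d+)&amp;documentId=\d+")
# Group moved to documentId: captures the ID that the thread's pattern extracts.
right = re.compile(r"Video\?bcastId=\d+&amp;documentId=(\d+)")

wrong_id = wrong.search(href).group(1)
right_id = right.search(href).group(1)
print(wrong_id, right_id)
```

Both patterns match the link, so a misplaced group does not raise an error; it only hands a different number to the rest of the code.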

raptor2101 commented 8 years ago

confirm, subsection "A-Z" doesn't work with that fix either...

marksml commented 8 years ago

Wait, it works if you only make the change for line 68 (self.regex_VideoPageLink only):

    self.regex_VideoPageLink = re.compile("<a href=\".*Video\?bcastId=\d+&documentId=(\d+)\" class=\"textLink\">\s+?<p class=\"dachzeile\">(.*?)</p>\s+?<h4 class=\"headline\">(.*?)</h4>")

Leave the regex_CategoryPageLink pattern as is with regard to the order of documentId and bcastId.

It works for me now.

Thanks & Regards

raptor2101 commented 8 years ago

try this:

    self.regex_VideoPageLink = re.compile("<a href=\".*Video\?.*?documentId=(\d+).*?\" class=\"textLink\">\s+?<p class=\"dachzeile\">(.*?)</p>\s+?<h4 class=\"headline\">(.*?)</h4>")
    self.regex_CategoryPageLink = re.compile("<a href=\"(.*Sendung\?.*?documentId=\d+.*?)\" class=\"textLink\">\s+?<p class=\"dachzeile\">.*?</p>\s+?<h4 class=\"headline\">(.*?)</h4>\s+?<p class=\"subtitle\">(.*?)</p>")
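
A rough sketch of why this looser pattern no longer depends on the parameter order: it skips whatever precedes and follows `documentId` inside the href. The HTML snippets are invented samples shaped to fit the pattern, not real ARD markup, and the pattern is rewritten as a raw string so it also runs cleanly on Python 3.

```python
import re

# Same VideoPageLink pattern as above, written as a raw string.
video_link = re.compile(
    r'<a href=".*Video\?.*?documentId=(\d+).*?" class="textLink">\s+?'
    r'<p class="dachzeile">(.*?)</p>\s+?'
    r'<h4 class="headline">(.*?)</h4>'
)

# Hypothetical link markup; only the query string differs between the samples.
TEMPLATE = (
    '<a href="/tv/Beispiel/Video?{query}" class="textLink">\n'
    '<p class="dachzeile">Beispiel</p>\n'
    '<h4 class="headline">Folge 1</h4>'
)

old_order = TEMPLATE.format(query="documentId=5678&amp;bcastId=1234")
new_order = TEMPLATE.format(query="bcastId=1234&amp;documentId=5678")

# Each entry holds (documentId, dachzeile text, headline text).
results = [video_link.search(html).groups() for html in (old_order, new_order)]
print(results)
```

With these samples both orderings yield the same three groups, which is the behaviour the stricter bcastId-first pattern could not provide.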

xhaggi commented 8 years ago

works thanks

marksml commented 8 years ago

Thank you very much indeed. Have a nice weekend! Cheers

marksml commented 8 years ago

Confirmed. Thank you very much indeed. Have a nice weekend! Cheers

gammapower commented 8 years ago

Could someone describe the solution in a bit more detail? I'm new to this and I don't know where I should change it.

Thank you very much!

raptor2101 commented 8 years ago

You shouldn't have to do anything. I pushed this change to the public repo and it was released a few days ago ... it should work out of the box.