XMLTV / xmltv

Utilities to obtain, generate, and post-process TV listings data in XMLTV format
GNU General Public License v2.0

Fix #208: broken tv_grab_pt_meo #217

Closed: jlbribeiro closed this pull request 8 months ago

jlbribeiro commented 8 months ago

What type of Pull Request is this?

Does this PR close any currently open issues?

This PR fixes #208.

Please explain what this PR does

tv_grab_pt_meo currently has several issues:

This PR addresses those issues and extends XMLTV::Get_nice::post_nice_json() to accept an optional hash of HTTP headers to include in the HTTP request. I wasn't sure whether this is the right approach, but since the function appears to be used only in tv_grab_pt_meo, I thought it was acceptable. Suggestions are more than welcome.
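As a rough illustration of the "optional hash of HTTP headers" idea, here is a minimal sketch. It is not the code from this PR or from XMLTV::Get_nice: the names build_post_options and post_json_sketch, the agent string, and the use of the core HTTP::Tiny module are all assumptions made for the example.

```perl
use strict;
use warnings;
use HTTP::Tiny;
use JSON::PP qw(encode_json decode_json);

# Hypothetical helper (not XMLTV's code): build the options for a JSON POST,
# merging optional caller-supplied headers over the defaults.
sub build_post_options {
    my ($payload, $extra_headers) = @_;
    my %headers = ( 'content-type' => 'application/json' );
    # Header names are case-insensitive, so normalise them to lowercase;
    # a caller-supplied value replaces the default of the same name.
    $headers{ lc $_ } = $extra_headers->{$_} for keys %{ $extra_headers // {} };
    return { headers => \%headers, content => encode_json($payload) };
}

# Hypothetical stand-in for a post_nice_json-style function. The third
# argument is optional, so existing two-argument callers keep working.
sub post_json_sketch {
    my ($url, $payload, $extra_headers) = @_;
    my $http = HTTP::Tiny->new( agent => 'example-grabber/0.1' );   # assumed agent string
    my $res  = $http->post( $url, build_post_options( $payload, $extra_headers ) );
    die "POST $url failed: $res->{status} $res->{reason}\n" unless $res->{success};
    return decode_json( $res->{content} );
}
```

Keeping the headers argument optional is what limits the blast radius of a change like this: callers that pass only a URL and a payload are unaffected.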

Any other information?

I am not at all familiar with Perl; I read up on some of the basics (for example, what the $$ signature means). I believe the changes make sense, but there may be Perl specifics I am missing. Please advise if so.
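For readers equally new to Perl: a `($$)` after a sub name is a prototype, which declares that the sub takes two arguments, each evaluated in scalar context, and a semicolon separates required from optional arguments. The sketch below uses hypothetical sub names (two_args, two_or_three); how post_nice_json() itself is declared should be checked against the XMLTV source.

```perl
use strict;
use warnings;

# Prototype ($$): exactly two arguments, each evaluated in scalar context.
sub two_args($$) {
    my ($x, $y) = @_;
    return "$x,$y";
}

# Prototype ($$;$): the same two arguments plus an optional third one.
sub two_or_three($$;$) {
    my ($x, $y, $opt) = @_;
    return defined $opt ? "$x,$y,$opt" : "$x,$y";
}

my @list = (10, 20, 30);

# Because of the prototype, @list is taken in scalar context (its length),
# rather than being flattened into three separate arguments.
print two_args(@list, 'b'), "\n";

# The optional third argument may be supplied or left out.
print two_or_three('u', 'p'), "\n";
print two_or_three('u', 'p', 'h'), "\n";
```

Under this reading, adding an optional argument to a sub declared with `($$)` means changing the prototype to something like `($$;$)`, otherwise a three-argument call fails to compile.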

This change has not been discussed on the xmltv-devel mailing list.

Where have you tested these changes?

Operating System: Alpine Linux v3.18.4

Perl Version: v5.36.1

moob158 commented 8 months ago

Can anyone build xmltv.exe for Windows?

I tried to build it, but the build fails with errors.

I also tested it without building xmltv.exe, and the grab produced errors:

getting listings: https://authservice.apps.meo.pt/Services/GridTv/GridTvMng.svc/getProgramsFromChannels (DW_TV) : no programmes found

https://authservice.apps.meo.pt/Services/GridTv/GridTvMng.svc/getProgramsFromChannels (C11) : no programmes found

https://authservice.apps.meo.pt/Services/GridTv/GridTvMng.svc/getProgramsFromChannels (MTV00S) : no programmes found

https://authservice.apps.meo.pt/Services/GridTv/GridTvMng.svc/getProgramsFromChannels (TRAHD) : no programmes found

https://authservice.apps.meo.pt/Services/GridTv/GridTvMng.svc/getProgramsFromChannels (MCMP) : no programmes found

https://authservice.apps.meo.pt/Services/GridTv/GridTvMng.svc/getProgramsFromChannels (MCMTHD) : no programmes found

https://authservice.apps.meo.pt/Services/GridTv/GridTvMng.svc/getProgramsFromChannels (AFRO) : no programmes found

https://authservice.apps.meo.pt/Services/GridTv/GridTvMng.svc/getProgramsFromChannels (TRACETC) : no programmes found

https://authservice.apps.meo.pt/Services/GridTv/GridTvMng.svc/getProgramsFromChannels (TRACEBR) : no programmes found

https://authservice.apps.meo.pt/Services/GridTv/GridTvMng.svc/getProgramsFromChannels (MEZZO) : no programmes found

ht

Many other channels also report "no programmes found".

honir commented 8 months ago

Nice detective work! However...

Spoofing the Origin header -- to pretend to be the Meo app when you're not -- is not something we would support in the XMLTV project.

jlbribeiro commented 8 months ago

@moob158 I had written a long response, but it is probably moot now (see @honir's comment above, and mine further below); I will leave some notes anyway. I also get "no programmes found" for DW_TV (which has an id of -1), but not for other channels. When grabbing all the channels I now hit a different error, in code that is only exercised after this fix (Label not found for "next PROG" at /usr/bin/tv_grab_pt_meo line 579). The API contains many inconsistencies (channels with no sigla, duplicate channel IDs, ...), so the grabber would probably need more work. In any case, can you provide the stderr output of tv_grab_pt_meo --fast --debug?


@honir I feared that, and I totally understand :P I had understood that XMLTV strives to be a "good netizen", but I was hoping that not spoofing the User-Agent (which still lets services block the grabber if they want to) would satisfy that criterion. I guess that means this grabber is broken for good. Should it be removed, then? Judging by the comments on #208, I believe pt_vodafone might be a good enough replacement, as the two probably share most of the channel grid. (And thank you for all your work on the XMLTV project, btw!)

rmeden commented 8 months ago

So adding an Origin header and keeping an XMLTV user-agent works? That's interesting... I can see the argument that "Origin" is really just part of the API, and that if we're still setting our own user-agent we're not hiding who we are (and are easily blockable if desired). Geoff, did you realize that the user-agent wasn't changing?

honir commented 8 months ago

Origin is not part of the API. It is an HTTP header normally set by a browser (or app) to indicate the source of the request. Thus it performs a similar function to the User-Agent but at a deeper level.

By spoofing the Origin you are hiding who you are: pretending to be someone else.

Origin => 'https://www.meo.pt'

You are effectively saying, "trust me, I'm from the Meo app"...but you're not. You're really xmltv.org!

(This is why browsers do not allow scripts to set the Origin header manually: doing so would open the browser up to cross-site security issues.)

As the site is actively checking the Origin header they are clearly expecting only requests from their own app to be permitted.
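To make the mechanism concrete, a server-side Origin gate could look roughly like the sketch below. Everything in it is an assumption for illustration: the function name origin_allowed, the lowercase header hash, and the allow-list are hypothetical, and nothing is known here about how the MEO service is actually implemented.

```perl
use strict;
use warnings;

# Hypothetical allow-list check, NOT the real MEO logic.
# $headers: hashref of request headers keyed by lowercase name.
# $allowed: arrayref of origins the service is willing to serve.
sub origin_allowed {
    my ($headers, $allowed) = @_;
    my $origin = $headers->{origin};
    return 0 unless defined $origin;    # no Origin header at all: reject
    return ( grep { $_ eq $origin } @$allowed ) ? 1 : 0;
}

my @allowed = ('https://www.meo.pt');

# A request made from the site's own pages carries the expected Origin.
print origin_allowed( { origin => 'https://www.meo.pt' }, \@allowed ), "\n";

# A client that sends only a User-Agent and no Origin is turned away.
print origin_allowed( { 'user-agent' => 'xmltv' }, \@allowed ), "\n";
```

A check of this kind only works as a trust signal because browsers fill in Origin themselves; any non-browser client can send whatever value it likes, which is exactly the spoofing concern raised here.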

jlbribeiro commented 8 months ago

@rmeden Let me clarify that I agree with @honir: Origin is a forbidden header name, i.e. scripts running in a browser cannot set or modify it. Browsers send it implicitly in certain contexts, and because it can only be changed outside a browser, it is sometimes used as a trust mechanism (note that Cookie is also a forbidden header name). The fact that the API checks its value is almost surely on purpose (nit: that doesn't necessarily mean the API's intent is to vet "clients"; restricting cross-origin requests is more about protecting the user, for instance from a page that implicitly sends their Cookie and performs an authenticated request on their behalf). I'm fully aware I was arguing semantics above; I was just trying to move the "line" to include Origin but exclude User-Agent :P

With that said, I'm assuming this PR should be closed (and that tv_grab_pt_meo should be removed from this project, as it is broken "for good")? @honir + @rmeden

honir commented 8 months ago

> With that said, I'm assuming this PR should be closed (and that tv_grab_pt_meo should be removed from this project, as it is broken "for good")?

It will probably be removed in the next release (c. January?). We can leave it until then in case MEO have a change of heart and reinstate the previous URLs (...unlikely, but you never know!)

jlbribeiro commented 8 months ago

Closing this PR; again, thank you all for this project!