gerardcl / renfe-cli

Python CLI written in Rust for fast Renfe search website trains timetables retrieval
BSD 3-Clause "New" or "Revised" License

Renfe Data #194

Open gtrabanco opened 5 months ago

gtrabanco commented 5 months ago

Describe the solution you'd like

Maybe you already know this and chose to scrape the Renfe site instead, but there is open JSON data with Renfe schedules and stations at this link:

Describe alternatives you've considered

I have considered using it, but since I live in the north and Feve does not respect schedules at all, it is not an option for me. However, there is a platform called "Elcano" that can provide real-time data on the trains.

I have been sniffing traffic because there is no public documentation on how to use it, but you can scrape ADIF. It provides real-time data only for the current train, which makes sense. However, if the train is stopped at a station waiting for another one (because there is only a single track), it doesn't report that; it just updates the train's arrival time after a while.

The phone app uses Firebase to register an ID for the user (I think based on the installation ID), after which you can access the Elcano API, but I am not really sure about this whole process because I haven't tested it yet. The difference is that on the phone you can see the almost real-time position of the train and not only the time until it arrives at the station.

Additional context

Hope this info is helpful.

gtrabanco commented 5 months ago

You can view an example here:

I'm not sure about emulating the app's installation process, but I can provide more information if you are interested in using this API instead.

gerardcl commented 5 months ago

Hi @gtrabanco!

Thanks for the info!

I did not know about the ADIF API. It looks really interesting, but I understand it covers train status rather than schedules.

Regarding the Renfe data, a colleague told me about it in the past. I looked into it back then (and reviewed it again now), and I decided to keep scraping the site in real time because I found that it has more accurate and up-to-date schedules.

I would love the Renfe site to be a modern site so that I could avoid launching a headless browser to scrape it, but the more the site evolves, the more complex it gets (I do not want to imagine the people who jump in from time to time to maintain it or add new features). It looks like they only apply patches instead of rebuilding it in a proper, reliable and secure way.

Thanks again and happy to get contributions any time! :+1:

Best regards,

gerardcl commented 3 months ago

FYI @gtrabanco --> https://github.com/gerardcl/renfe-cli/actions/runs/8774260521/job/24587873455 It looks like the page that renfe-cli scrapes for the timetables is now constantly under maintenance (it has been for weeks...).

I guess, if we want to keep this CLI alive we might need to finally shift to one of the two proposals you made (since I don't want to parse this crap site anymore).

Do you know if the https://ssl.renfe.com/gtransit/Fichero_AV_LD/google_transit.zip file is really accurate and trustworthy?
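One way to spot-check the feed against real departures is to read it locally. The sketch below assumes the zip is a standard GTFS feed (the file name suggests it, but the thread does not confirm it) containing `stops.txt` and `stop_times.txt` with the usual GTFS columns. The function names are hypothetical, and nothing is downloaded here.

```python
import csv
import io
import zipfile


def read_table(feed: zipfile.ZipFile, name: str) -> list[dict]:
    """Read one text file from the archive as a list of row dicts."""
    with feed.open(name) as raw:
        # utf-8-sig tolerates a leading byte-order mark if the file has one
        text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
        return list(csv.DictReader(text))


def departures(feed_file, stop_name: str) -> list[tuple[str, str]]:
    """Return sorted (departure_time, trip_id) pairs for stops whose name
    contains stop_name (case-insensitive).

    Assumes standard GTFS column names: stop_id, stop_name, trip_id,
    departure_time. Sorting is plain string order, which matches time order
    only when times are zero-padded HH:MM:SS.
    """
    with zipfile.ZipFile(feed_file) as feed:
        stop_ids = {
            row["stop_id"]
            for row in read_table(feed, "stops.txt")
            if stop_name.lower() in row["stop_name"].lower()
        }
        return sorted(
            (row["departure_time"], row["trip_id"])
            for row in read_table(feed, "stop_times.txt")
            if row["stop_id"] in stop_ids
        )
```

Calling `departures("google_transit.zip", "Oviedo")` on a downloaded copy would list the scheduled departures for matching stops, which can then be compared by hand with what the Renfe site shows for the same day.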

gtrabanco commented 3 months ago

Hi.

I can't say anything about Renfe because I only use Feve from time to time, and Feve is not trustworthy at all. Every time, I have to use the ADIF application, and even ADIF is not really accurate because some trains get cancelled all the time.

I suppose those are the official schedules, but real life depends on how Renfe/Rodalies works in your region. As I said, Feve is a mess. I suppose there are some good services somewhere, but in the village where I live, sadly, not at all 😅

Important: the example in the link I provided to the MadridTransporte project does not work. I suppose there is some token generation/authentication step that needs to be solved. I used HTTP Analyzer to check the ADIF solution on my phone, and it generates a user per app ID (I think, not sure). It uses Firebase, and there is a possibility that Firebase is not well configured and we could access the data without auth or with a permanent token. Since I don't need this right now, I paused the project; I want to integrate it into Home Assistant when I need it in the (I hope) near future.

gtrabanco commented 3 months ago

If all that is too much, maybe you could get the info by parsing the site:

You give it a station and it returns the schedule of the next trains.

gtrabanco commented 3 months ago

I found another way to do it by scraping. But this isn't very useful for Feve, because some services aren't shown and some cancelled ones still are.

1. All stations

No auth, just a GET request:

https://www.adifaltavelocidad.es/inicio?p_p_id=es_adif_portlet_ajaxsearch_form_AjaxSearchFormPortlet_INSTANCE_RXX6uwDpW4sM&p_p_lifecycle=2&p_p_state=normal&p_p_mode=view&p_p_resource_id=%2Fcommon%2Fget-estaciones&p_p_cacheability=cacheLevelPage&_es_adif_portlet_ajaxsearch_form_AjaxSearchFormPortlet_INSTANCE_RXX6uwDpW4sM_forcedGroupId=20124
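For readability, the same URL can be rebuilt from its query parameters. This is only a sketch: the constant and function names are hypothetical, the parameter values are copied from the URL above, and the thread does not say what format the response has, so the fetch helper returns the raw bytes untouched.

```python
import urllib.request
from urllib.parse import urlencode

BASE = "https://www.adifaltavelocidad.es/inicio"
# Portlet instance id copied from the URL above; it may change if the site is redeployed.
PORTLET = "es_adif_portlet_ajaxsearch_form_AjaxSearchFormPortlet_INSTANCE_RXX6uwDpW4sM"


def stations_url(group_id: str = "20124") -> str:
    """Rebuild the 'all stations' URL from its individual query parameters."""
    params = {
        "p_p_id": PORTLET,
        "p_p_lifecycle": "2",
        "p_p_state": "normal",
        "p_p_mode": "view",
        "p_p_resource_id": "/common/get-estaciones",  # urlencode escapes the slashes
        "p_p_cacheability": "cacheLevelPage",
        f"_{PORTLET}_forcedGroupId": group_id,
    }
    return f"{BASE}?{urlencode(params)}"


def fetch_stations_raw(timeout: float = 10.0) -> bytes:
    """Plain GET without auth, as described above; returns the undecoded body."""
    with urllib.request.urlopen(stations_url(), timeout=timeout) as resp:
        return resp.read()
```

Keeping the parameters in a dict makes it easier to see which parts are fixed and which (such as the portlet instance id) are likely to break when the site changes.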

2. Station overview

POST with station id in the origen field:

fetch("https://www.adifaltavelocidad.es/inicio?p_p_id=es_adif_portlet_ajaxsearch_form_AjaxSearchFormPortlet_INSTANCE_RXX6uwDpW4sM&p_p_lifecycle=1&p_p_state=normal&p_p_mode=view&_es_adif_portlet_ajaxsearch_form_AjaxSearchFormPortlet_INSTANCE_RXX6uwDpW4sM_javax.portlet.action=%2Fdefault%2Fsearch&p_auth=1vqTUVbM", {
  "headers": {
    "content-type": "application/x-www-form-urlencoded",
    "sec-ch-ua": "\"Brave\";v=\"125\", \"Chromium\";v=\"125\", \"Not.A/Brand\";v=\"24\"",
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": "\"macOS\"",
    "upgrade-insecure-requests": "1",
    "Referer": "https://www.adifaltavelocidad.es/",
    "Referrer-Policy": "strict-origin-when-cross-origin"
  },
  "body": "origen=3067758",
  "method": "POST"
});

The response carries a Location header pointing to the station URL:

Location: https://www.adif.es/-/60907-novelda-aspe?tipoBusqueda=proximasSalidas&trafficType=cercanias&pageFromPlid=335
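The two steps above (POST the station id, then read the Location header) could be sketched in Python as follows. Everything here is an assumption layered on the captured request: the function and class names are hypothetical, the `p_auth` value in the captured URL is probably tied to a browser session and would have to be obtained fresh, and treating the leading number in the redirect path as a station code is a guess.

```python
import urllib.error
import urllib.request
from urllib.parse import parse_qs, urlencode, urlsplit


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Keep urllib from following redirects so the Location header stays visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError for the 3xx response


def build_search_request(action_url: str, station_id: str) -> urllib.request.Request:
    """Build the form-encoded POST; action_url must already include a p_auth value."""
    body = urlencode({"origen": station_id}).encode("ascii")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Referer": "https://www.adifaltavelocidad.es/",
    }
    return urllib.request.Request(action_url, data=body, headers=headers, method="POST")


def fetch_location(req: urllib.request.Request, timeout: float = 10.0):
    """Send the request and return the Location header, or None if there is none."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(req, timeout=timeout) as resp:
            return resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        if 300 <= err.code < 400:
            return err.headers.get("Location")
        raise


def parse_station_location(location: str) -> dict:
    """Split the redirect URL into host, numeric prefix, slug and query parameters."""
    parts = urlsplit(location)
    last_segment = parts.path.rstrip("/").rsplit("/", 1)[-1]  # e.g. "60907-novelda-aspe"
    code, _, slug = last_segment.partition("-")
    query = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {"host": parts.netloc, "code": code, "slug": slug, **query}
```

For the Location shown above, `parse_station_location` yields `code` "60907", `slug` "novelda-aspe", `tipoBusqueda` "proximasSalidas" and `trafficType` "cercanias". Note that the number in the path differs from the `origen` value sent in the POST, so the two ids should not be assumed to be interchangeable.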