mfdz / GTFS-Issues

Documentation and Tracking of Issues in GTFS- and GTFS-RT Feeds

opendata-oepnv permalink mechanism unusable #111

Open hbruch opened 1 year ago

hbruch commented 1 year ago

For some time now, https://www.opendata-oepnv.de/ has offered a permalink mechanism for downloading datasets from a stable URL.

Logged-in users can generate a download link via the control shown in the screenshot ("grafik"). The dataset is supposed to be retrievable through this link up to 5 times per week.

The mechanism is apparently implemented as a 303 redirect and seems to require cookies. An automated download with curl returns an HTML page instead of the dataset.

In addition, the server does not answer the failed request with a 403 Forbidden error; it delivers a web page with status code 200 instead. This does not conform to common web standards.

> GET /index.php?id=1384&tx_vrrkit_view%5Bsharing%5D=<user-specific-token>A2fQ%3D%3D&tx_vrrkit_view%5Baction%5D=download&tx_vrrkit_view%5Bcontroller%5D=View&tx_vrrkit_view%5Bformat%5D=zip HTTP/1.1
> Host: www.opendata-oepnv.de
> User-Agent: curl/7.79.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 303 See Other
< Date: Mon, 19 Dec 2022 09:32:06 GMT
< Server: Apache/2.4.6 (CentOS) OpenSSL/1.0.2k-fips PHP/7.2.34 mod_wsgi/3.4 Python/2.7.5
< X-Powered-By: PHP/7.2.34
< Set-Cookie: fe_typo_user=wasverified
< Location: https://www.opendata-oepnv.de/fileadmin/datasets/delfi/20221219_zHV_gesamt.zip
< Cache-Control: max-age=0
< Expires: Mon, 19 Dec 2022 09:32:06 GMT
< X-UA-Compatible: IE=edge
< X-Content-Type-Options: nosniff
< Content-Length: 0
< Content-Type: text/html; charset=UTF-8
<
* Connection #0 to host www.opendata-oepnv.de left intact
* Issue another request to this URL: 'https://www.opendata-oepnv.de/fileadmin/datasets/delfi/20221219_zHV_gesamt.zip'
* Found bundle for host www.opendata-oepnv.de: 0x600001db0de0 [serially]
* Can not multiplex, even if we wanted to!
* Re-using existing connection! (#0) with host www.opendata-oepnv.de
* Connected to www.opendata-oepnv.de (109.75.188.38) port 443 (#0)
> GET /fileadmin/datasets/delfi/20221219_zHV_gesamt.zip HTTP/1.1
> Host: www.opendata-oepnv.de
> User-Agent: curl/7.79.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 302 Found
< Date: Mon, 19 Dec 2022 09:32:06 GMT
< Server: Apache/2.4.6 (CentOS) OpenSSL/1.0.2k-fips PHP/7.2.34 mod_wsgi/3.4 Python/2.7.5
< Location: https://www.opendata-oepnv.de/?backlink=/fileadmin/datasets/delfi/20221219_zHV_gesamt.zip
< Cache-Control: max-age=0
< Expires: Mon, 19 Dec 2022 09:32:06 GMT
< Content-Length: 273
< Content-Type: text/html; charset=iso-8859-1
<
* Ignoring the response-body
* Connection #0 to host www.opendata-oepnv.de left intact
* Issue another request to this URL: 'https://www.opendata-oepnv.de/?backlink=/fileadmin/datasets/delfi/20221219_zHV_gesamt.zip'
* Found bundle for host www.opendata-oepnv.de: 0x600001db0de0 [serially]
* Can not multiplex, even if we wanted to!
* Re-using existing connection! (#0) with host www.opendata-oepnv.de
* Connected to www.opendata-oepnv.de (109.75.188.38) port 443 (#0)
> GET /?backlink=/fileadmin/datasets/delfi/20221219_zHV_gesamt.zip HTTP/1.1
> Host: www.opendata-oepnv.de
> User-Agent: curl/7.79.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 307 Temporary Redirect
< Date: Mon, 19 Dec 2022 09:32:06 GMT
< Server: Apache/2.4.6 (CentOS) OpenSSL/1.0.2k-fips PHP/7.2.34 mod_wsgi/3.4 Python/2.7.5
< X-Powered-By: PHP/7.2.34
< location: /ht/de/willkommen?backlink=%2Ffileadmin%2Fdatasets%2Fdelfi%2F20221219_zHV_gesamt.zip&cHash=c8b592ad33430fe1f867646610870a4e
< Cache-Control: max-age=0
< Expires: Mon, 19 Dec 2022 09:32:06 GMT
< X-UA-Compatible: IE=edge
< X-Content-Type-Options: nosniff
< Content-Length: 0
< Content-Type: text/html; charset=UTF-8
<
* Connection #0 to host www.opendata-oepnv.de left intact
* Issue another request to this URL: 'https://www.opendata-oepnv.de/ht/de/willkommen?backlink=%2Ffileadmin%2Fdatasets%2Fdelfi%2F20221219_zHV_gesamt.zip&cHash=c8b592ad33430fe1f867646610870a4e'
* Found bundle for host www.opendata-oepnv.de: 0x600001db0de0 [serially]
* Can not multiplex, even if we wanted to!
* Re-using existing connection! (#0) with host www.opendata-oepnv.de
* Connected to www.opendata-oepnv.de (109.75.188.38) port 443 (#0)
> GET /ht/de/willkommen?backlink=%2Ffileadmin%2Fdatasets%2Fdelfi%2F20221219_zHV_gesamt.zip&cHash=c8b592ad33430fe1f867646610870a4e HTTP/1.1
> Host: www.opendata-oepnv.de
> User-Agent: curl/7.79.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Date: Mon, 19 Dec 2022 09:32:06 GMT
< Server: Apache/2.4.6 (CentOS) OpenSSL/1.0.2k-fips PHP/7.2.34 mod_wsgi/3.4 Python/2.7.5
< X-Powered-By: PHP/7.2.34
< Content-Language: de
< Cache-Control: private, no-store, max-age=0
< Content-Length: 35072
< Vary: Accept-Encoding
< Expires: Mon, 19 Dec 2022 09:32:06 GMT
< X-UA-Compatible: IE=edge
< X-Content-Type-Options: nosniff
< Content-Type: text/html; charset=utf-8
<
<!DOCTYPE html>
<html dir="ltr" lang="de">
<head>

<meta charset="utf-8">
<!--
    This website is powered by TYPO3 - inspiring people to share!
    TYPO3 is a free open source Content Management Framework initially created by Kasper Skaarhoj and licensed under GNU/GPL.
    TYPO3 is copyright 1998-2022 of Kasper Skaarhoj. Extensions are copyright of their respective owners.
    Information and contribution at https://typo3.org/
-->

...
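
Because the failed download ends in `200 OK` with `Content-Type: text/html`, a client cannot rely on the status code alone. The following is a minimal sketch of a defensive check, not part of the original report; the helper names `is_zip_payload` and `require_zip` are hypothetical, and it assumes the dataset is always delivered as a ZIP archive.

```python
import io
import zipfile


def is_zip_payload(data: bytes) -> bool:
    """Return True if the downloaded bytes form a readable ZIP archive."""
    return zipfile.is_zipfile(io.BytesIO(data))


def require_zip(data: bytes, content_type: str = "") -> bytes:
    """Raise instead of silently accepting an HTML page delivered with 200 OK.

    content_type is the value of the response's Content-Type header, if known.
    """
    if content_type.lower().startswith("text/html") or not is_zip_payload(data):
        raise ValueError("expected a ZIP archive, got something else (HTML page?)")
    return data
```

Calling `require_zip(body, response.headers.get("Content-Type", ""))` right after a download would turn the silent HTML response into an explicit error.
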
derhuerst commented 1 year ago

I reported the problem to VRR by email today.

fahrplaner commented 8 months ago

The permanent link works with wget.

derhuerst commented 8 months ago

> The permanent link works with wget.

That is because wget resends cookies set via the Set-Cookie response header when it follows a redirect.

curl does not do this by default, only with -b ''. Alternatively, the cookie set by the server can be emulated with -H 'Cookie: fe_typo_user=wasverified; path=/'.

So cookies are still required, even though they serve no actual purpose. We suggest allowing access without a cookie being set.
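
For scripted downloads outside of curl and wget, the same two workarounds can be sketched in Python with the standard library. This is an illustration under assumptions, not a tested recipe against the live site: the function names are hypothetical, the cookie value is the one visible in the trace above, and the permalink URL has to be supplied by the caller.

```python
import http.cookiejar
import urllib.request


def make_cookie_opener(jar: http.cookiejar.CookieJar) -> urllib.request.OpenerDirector:
    """Build an opener that records Set-Cookie headers in jar and resends
    matching cookies, including on requests issued while following redirects."""
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def make_static_cookie_request(
    url: str, cookie: str = "fe_typo_user=wasverified"
) -> urllib.request.Request:
    """Alternative: send the cookie value seen in the trace as a fixed header."""
    return urllib.request.Request(url, headers={"Cookie": cookie})


def download(url: str, target_path: str) -> None:
    """Fetch url with cookie handling enabled and write the body to target_path."""
    opener = make_cookie_opener(http.cookiejar.CookieJar())
    # urllib follows 302/303/307 redirects by default.
    with opener.open(url, timeout=60) as response, open(target_path, "wb") as out:
        out.write(response.read())
```

The cookie jar variant mirrors what wget does and what curl does with -b ''. The static header variant corresponds to the -H workaround and would break if the server ever changed the cookie value.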

derhuerst commented 8 months ago

FYI: @juliuste mirrors the official DELFI GTFS & NeTEx feeds here to allow for less limited machine access: