sustainableaviation / demandmap

✈️🌐 Map of Global Air Transport (with Future Demand)
BSD 3-Clause "New" or "Revised" License

Use FlightRadar24 etc. to get additional Data #24

Open michaelweinold opened 4 months ago

michaelweinold commented 4 months ago

Since AeroDataBox alone likely won't provide enough information to estimate the number of passengers on specific routes, we can use data from FlightRadar24 (or other sources) in addition.

Conveniently, there is already a Python package wrapping the API: FlightRadarAPI. I am feeding data to the site, so I have an active Business Subscription, which is required for API access. However, it seems from the documentation that they don't have historical data in the API at the moment. If this really is the case, we might need to look for data "manually" in the FlightRadar24 data archive.

There is also pyflightdata and the ADSBExchange API.

@dodedic, I suggest you contact the ADSBExchange team about the possibility of receiving a data dump of historical data (perhaps for the case studies):

Screenshot 2024-05-12 at 05 24 56
dodedic commented 4 months ago

FYI: E-Mail has been sent to them, still waiting on an answer.

dodedic commented 4 months ago

@arebe337 About the PAX numbers on all the routes, we could also take the approach of estimating the average available seats on all routes using FlightRadar24, like I did here. We take the aircraft types from FR24 and average the available seats; then, assuming an average load factor, we could derive an estimate for the number of PAX.
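A minimal sketch of this estimate, to make the arithmetic concrete. The seat counts, the 0.8 load factor and the function name `estimate_pax` are illustrative assumptions, not values or code from this project.

```python
from statistics import mean

# Hypothetical seats per aircraft type (placeholder values, not project data).
SEATS_BY_TYPE = {"A320": 180, "B738": 189, "AT76": 70}


def estimate_pax(aircraft_types, flights, load_factor=0.8):
    """Estimate passengers on one route.

    aircraft_types: aircraft types observed on the route, one entry per flight
    flights:        number of flights on the route in the period of interest
    load_factor:    assumed average share of seats occupied (placeholder)
    """
    # Average the available seats over the observed flights.
    avg_seats = mean(SEATS_BY_TYPE[t] for t in aircraft_types)
    # Scale by the number of flights and the assumed load factor.
    return flights * avg_seats * load_factor


if __name__ == "__main__":
    print(estimate_pax(["A320", "A320", "B738"], flights=1000))
```

The load factor is the weakest assumption here, which is why benchmarking against routes with known annual PAX traffic matters.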

This approach has several implications:

While ADB has an API call to get the aircraft types departing from an airport and on what routes, it is limited to the current date. Update: this can actually be done with a separate Tier 3 call in ADB! More info in the comment below.

arebe337 commented 4 months ago

@dodedic what would we get from this Tier 3 call?

dodedic commented 4 months ago

So I could provide a table of data, @arebe337, just like the one @michaelweinold suggested here.

It would hold the "average" available seats for each route in this format. Then, using your script, which uses the number of flights, we can put it together to get the PAX estimate. Like I mentioned above, we benchmark the data using routes where we know the annual PAX traffic.

The API call I would use is this one. It's a Tier 3 request which we have 150'000 of still. In this request get the destination and the aircraft type (but crucially not the amount of flights...so the work done so far is still important).

I will first make a list of all aircraft types and attach an average available seats to it. I would then, for all 3'144 airports, take the departing flights only and average the number of available seats across all departing flights to each destination. I would do this for 7 days in each of 3 weeks. One call covers 12 hours, so 2 calls = 1 day, which gives 3'144 × 2 calls × 7 days = 44'016 calls for 1 week. So we could do 3 weeks of the year for all airports with 132'048 of the 150'000 available calls.
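The call budget above can be checked with a few lines. All numbers are taken from this comment; only the variable names are made up.

```python
AIRPORTS = 3144        # airports to query
CALLS_PER_DAY = 2      # one call covers 12 hours
DAYS_PER_WEEK = 7
WEEKS = 3              # sampled weeks per year
BUDGET = 150_000       # remaining Tier 3 calls

calls_per_week = AIRPORTS * CALLS_PER_DAY * DAYS_PER_WEEK
total_calls = calls_per_week * WEEKS
remaining = BUDGET - total_calls

print(calls_per_week)  # 44016
print(total_calls)     # 132048
print(remaining)       # 17952
```

That leaves 17'952 calls of headroom for retries or a trial on a single route.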

dodedic commented 4 months ago

We could also trial this approach by using one specific route again first, if the Lagos-Abuja example from here is not convincing enough.

@michaelweinold Would love to hear your input on this method/estimation before I get to coding and making API calls.

arebe337 commented 4 months ago

So, @dodedic, are you referring to the 'Flight status' call in this scenario? The link you provided directs to the 'FIDS (airport departures and arrivals) - by relative time / by current time' call. I'm unsure where that one would display the type of aircraft.

But that approach could indeed give us a reliable approximation!

dodedic commented 4 months ago

Actually it is the 'FIDS (airport departures and arrivals) - by relative time / by current time' call! It gives this response, with 473 departures in this case. Under "airport" is the destination from LSZH and under "aircraft" is the type.

image

dodedic commented 4 months ago

If we then use this call, it would give us the exact number of seats on that flight. This is a Tier 1 call, of which we have 200'000. Now let's say we request data for this one aircraft with this registration and we store it as well. Then, if we see that a later flight is again operated by this aircraft, we simply use the stored value and do not make a new API request. This would keep us from burning through too many API requests.

With the current worldwide airliner fleet at around 20'000-30'000 aircraft, well below our 200'000 Tier 1 calls, this should work out.

image
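The "store it and reuse it" idea could look roughly like this. It is only a sketch: `fetch_seats` stands in for whatever function ends up making the Tier 1 request, and `fake_fetch` is a made-up stub so the example runs without touching the API.

```python
def make_seat_lookup(fetch_seats):
    """Wrap a seat-count fetcher so each registration is requested only once."""
    cache = {}

    def seats_for(registration):
        # Only call the (expensive) fetcher for registrations not seen before.
        if registration not in cache:
            cache[registration] = fetch_seats(registration)
        return cache[registration]

    return seats_for, cache


# Hypothetical stand-in for the real API request; counts how often it is called.
api_calls = []


def fake_fetch(registration):
    api_calls.append(registration)
    return 180  # placeholder seat count


seats_for, cache = make_seat_lookup(fake_fetch)
seats_for("HB-JJK")
seats_for("HB-JJK")  # second lookup is served from the cache

if __name__ == "__main__":
    print(len(api_calls))
```

Writing `cache` to disk between runs (for example as JSON) would keep the saved requests across sessions.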

arebe337 commented 4 months ago

Alright, that sounds like a plan! I would be really happy if you could do this part :) Currently, I'm working on preparing everything with the GDP sheet so we can seamlessly integrate it into your code for scaling.

dodedic commented 4 months ago

Of course! I will do it gladly 😄 Just want to make sure the approach is all good with Michael first.

dodedic commented 4 months ago

Possible issue I see: let's say I take one week each in August, January and May to get a good spread of data for high and low season. There might be flight routes in your file that won't be flown in the 3 weeks that I select. I would say this could affect smaller airports. Do we see this as a big issue?

arebe337 commented 4 months ago

I agree, it shouldn't pose a major issue. However, we do need to consider our approach for handling such cases. We should definitely identify the month with the highest number of different connections and prioritize that month. But I also agree, taking into account different seasons could also be beneficial for a more comprehensive analysis.
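A rough sketch of how we could pick that month and list the routes our sample would miss. The `(month, origin, destination)` input format and all function names are assumptions for illustration; the sample data is invented.

```python
from collections import defaultdict


def distinct_routes_per_month(flights):
    """Map each month to the set of (origin, destination) pairs flown in it."""
    routes = defaultdict(set)
    for month, origin, destination in flights:
        routes[month].add((origin, destination))
    return routes


def month_with_most_routes(flights):
    """Return the month with the highest number of distinct connections."""
    routes = distinct_routes_per_month(flights)
    return max(routes, key=lambda m: len(routes[m]))


def unsampled_routes(flights, sampled_months):
    """Return routes that are never flown in any of the sampled months."""
    routes = distinct_routes_per_month(flights)
    all_routes = set().union(*routes.values())
    covered = set().union(*(routes[m] for m in sampled_months if m in routes))
    return all_routes - covered


# Invented example data.
flights = [
    ("Jan", "LOS", "ABV"), ("Jan", "ZRH", "LHR"),
    ("May", "LOS", "ABV"),
    ("Aug", "LOS", "ABV"), ("Aug", "ZRH", "LHR"), ("Aug", "ZRH", "JFK"),
    ("Oct", "ZRH", "PMI"),
]

if __name__ == "__main__":
    print(month_with_most_routes(flights))
    print(unsampled_routes(flights, ["Jan", "May", "Aug"]))
```

Routes returned by `unsampled_routes` would need a fallback, for example the type-based seat average.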

michaelweinold commented 4 months ago

@dodedic:

> I will first make a list of all aircraft types and attach an average available seats to it. https://github.com/sustainableaviation/demandmap/issues/24#issuecomment-2114889287

You can use the table I created with my last master's student: https://github.com/sustainableaviation/Aircraft-Performance/blob/main/Databank.xlsx

...but as per your more recent comment https://github.com/sustainableaviation/demandmap/issues/24#issuecomment-2114943738, it seems that you can go through all active aircraft and just get the exact number of seats.

> Actually it is the 'FIDS (airport departures and arrivals) - by relative time / by current time' call! https://github.com/sustainableaviation/demandmap/issues/24#issuecomment-2114936898

So in this example, you have 473 departures, all(?) of which are from a single aircraft HB-JJK?

> The API call I would use is this one. It's a Tier 3 request which we have 150'000 of still. In this request get the destination and the aircraft type (but crucially not the amount of flights...so the work done so far is still important). https://github.com/sustainableaviation/demandmap/issues/24#issuecomment-2114889287

So the GetAirportFlightsRelative call returns what exactly? The aircraft registration for every aircraft departing the airport? What do you mean by "In this request get the destination and the aircraft type (but crucially not the amount of flights...so the work done so far is still important)."?

dodedic commented 4 months ago

> ...but as per your more recent comment #24 (comment), it seems that you can go through all active aircraft and just get the exact number of seats.

Exactly, we get all the active aircraft.

> So in this example, you have 473 departures, all(?) of which are from a single aircraft HB-JJK?

Actually we get all departures from LSZH in the selected timeframe, one of which happened to be HB-JJK. All other flights are from other airlines and other aircraft.

> So the GetAirportFlightsRelative call returns what exactly? The aircraft registration for every aircraft departing the airport? What do you mean by "In this request get the destination and the aircraft type (but crucially not the amount of flights...so the work done so far is still important)."?

It returns (among other information) the destination of each flight departing the airport within the specified timeframe, as well as the exact aircraft registration and type of aircraft, as seen in this comment's screenshot. This means we can either take the general aircraft type and estimate via your compiled list, or make an API call with ADB as mentioned above in order to get the exact number of seats on that flight.
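To illustrate the two options, here is a sketch that prefers the exact seat count per registration and falls back to the type average. The flat record layout (`destination`, `registration`, `aircraft_type`) is a simplification assumed for this example and not the actual response format of the call; registrations and seat numbers are placeholders.

```python
from collections import defaultdict
from statistics import mean


def average_seats_by_destination(departures, seats_by_registration, seats_by_type):
    """Average available seats per destination for one departure airport."""
    seats = defaultdict(list)
    for dep in departures:
        # Option 1: exact seat count for this specific aircraft, if known.
        n = seats_by_registration.get(dep["registration"])
        # Option 2: fall back to the average for the aircraft type.
        if n is None:
            n = seats_by_type.get(dep["aircraft_type"])
        # Skip flights where neither source has a value.
        if n is None:
            continue
        seats[dep["destination"]].append(n)
    return {dest: mean(values) for dest, values in seats.items()}


# Invented example data with placeholder registrations.
departures = [
    {"destination": "EGLL", "registration": "REG-1", "aircraft_type": "A320"},
    {"destination": "EGLL", "registration": "REG-2", "aircraft_type": "A320"},
    {"destination": "KJFK", "registration": "REG-3", "aircraft_type": "A333"},
    {"destination": "LEPA", "registration": "REG-4", "aircraft_type": "UNKNOWN"},
]
seats_by_registration = {"REG-1": 174}
seats_by_type = {"A320": 180, "A333": 236}

result = average_seats_by_destination(departures, seats_by_registration, seats_by_type)

if __name__ == "__main__":
    print(result)
```

Flights with an unknown type are dropped here; counting them separately would show how much of the traffic the estimate actually covers.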

"In this request get the destination and the aircraft type (but crucially not the amount of flights...so the work done so far is still important)."?

This meant that the GetAirportFlightsRelative call crucially doesn't include the average number of daily flights from an airport, so those Tier 3 calls we made so far using the Statistical API were still necessary. That was just a note.

dodedic commented 4 months ago

I am currently updating the Mermaid diagram to visualize this process!