xoolive / traffic

A toolbox for processing and analysing air traffic data
https://traffic-viz.github.io/
MIT License
361 stars 80 forks source link

Improve support for SCAT dataset #437

Closed niclaswue closed 3 months ago

niclaswue commented 4 months ago

As discussed here I added the engine flag to the eval calls. While I was at it, I also implemented some functions for loading the waypoints and weather data. Existing code should not break as I added the additional flags include_weather and include_waypoints and set them to False by default. Both weather and waypoints are just dataframes right now because there is some information missing from the waypoints which would be needed to make them Navaid objects. The weather data is decoded grib format which is different from the implemented metar data format.

Let me know what you think! Thanks

xoolive commented 4 months ago

Thank you so much,

With reflexion, I see another alternative here: how about changing the .eval(..., engine="python") with an .assign(... = lambda df: ..., ) ? What do you find is more clear? I am more inclined to prefer the lambda option: eval is great, but if we need the engine kwarg, it is not true anymore.

Also about the test, do you mind writing something slightly more explicit for weather and waypoints, like the presence of some column in the dataframe? or an equality for the max of some value in a given column? This helps to ensure we get the proper columns we expect with these methods...

niclaswue commented 3 months ago

Yes, good call. I think assign was meant for this. I believe it would make error messages and thus debugging much easier.

Sure, I can write some better tests when I’m back on office on Tuesday.

niclaswue commented 3 months ago

I added some real test cases now. However, I wondered if just returning the waypoints as a large dataframe is really the best option, here. We do not have information other than latitude, longitude, name in the dataset but it would be nice to have a list of Navaid objects which in turn could be used for parsing the flight plans, if that is possible?

Additionally, SCAT provides data to describe the airspace which could be parsed as well, I think. Let me know if you would want to keep this all in one PR or split it into multiple PRs.

niclaswue commented 3 months ago

I dove a bit deeper into the navaids and rewrote it so that the scat dataset holds the points as a Navaids object. This way the user could do something like this:

from traffic.data import navaids
from traffic.data.datasets.scat import SCAT

print(navaids.get("VISI"))  # None
print(navaids.get("ZR1"))  # None

scat = SCAT("scat20161112_20161118.zip", nflights=50, include_waypoints=True)
navaids.alternatives["SCAT"] = scat.waypoints

print(navaids.get("VISI"))  # Navaid('VISI', type='FIX', latitude=58.1, longitude=14.4)
print(navaids.get("ZR1"))  # Navaid('ZR1', type='FIX', latitude=59.0, longitude=16.0)

in my opinion, this would be quite clean and more useful as it also helps with parsing the flight plans.

xoolive commented 3 months ago

Thanks!

xoolive commented 3 months ago

I just fixed the tests :scream: Also maybe it would make sense to append .to_xarray() for the weather dataframe, but on the other hand it is an optional dependency so... :shrug: maybe not...

niclaswue commented 3 months ago

Hey there, thanks for merging! Sorry, the tests were still for the dataframe implementation, forgot to mention that in my comment 🙄. I am not familiar with xarray and its advantages, if it is only a .to_xarray() call, I believe it would also be ok to leave it to the user to convert it if necessary.