Closed xoolive closed 2 months ago
The proper way to add a method to LazyTraffic
is to add a decorator to a method without an implementation (use the ellipsis ...
) in the Traffic
class, like everything below this line:
Note that the decorator will look for a method in Flight with the same name:
At the moment only two main categories of Flight methods are supported:
True
, the input flight is passed as is to the next method, if False, the processing for the current flight stops;Optional[Flight]
: if None
, the processing for the current flight stops, if a Flight
, the output flight is passed to the next method on the stack.summary()
which returns a dictionary but it can only be used at the end of the stack: eval()
will create a pd.DataFrame
out of all the records. There is nothing in the code to ensure that no other method is stacked after it.The most important category that is not supported as is are methods returning a FlightIterator
: since there can be many segments returned, this would break the linearity of the stack.
Instead of
flight:
|_ method1 -> bool
|_ method2 -> None | Flight
|_ method3 -> None | Flight
It could create a tree of executions, with possible RecursionError associated, and without the possibility to merge branches later:
flight
|_ method1 -> bool
|_ method2 -> FlightIterator
|_ [segment1]
| |_ method3 -> FlightIterator
| |_ ...
|_ [segment2]
| |_ method3 -> FlightIterator
| |_ ...
A common use case with FlightIterator
however is to use an aggregator right behind (i.e. a function FlightIterator -> None | bool | Flight
)
That pattern is supported in LazyTraffic
, with many aggregators such as has
, next
, or all
, they also support methods being passed as a string to avoid passing lambdas which cannot be serialized (and multiprocessed), that's why instead of:
flight.aligned_on_ils("EHRD").next() # None | Flight
traffic.aligned_on_ils("EHRD").next().eval() # INVALID code
^__ returns a FlightIterator
^__ brings the FlightIterator back to a Flight
You would write:
traffic.next('aligned_on_ils("EHRD")').eval()
^__ next returns an Optional[Flight]
t = (
t.iterate_lazy(iterate_kw=dict(by="6 hours"))
.assign_id(name="{self.callsign}_{idx:>04}") # you assign a flight_id
.filter()
.longer_than("1 hour")
.pipe(enough_data)
.split("30min", condition=no_split_enroute) # this returns a FlightIterator
.pipe(full_flight)
)
The problem here that needs clarification is that when you split the trajectories, all the pieces will have the same flight_id. I am not sure if it is what you want. You may have wanted to put the assign_id()
at the end of the stack? It feels error-prone though. I am also not sure what the full_flight
method does.
That's the reason I am not very sure if I want to open the Pandora box with lazy flight iterators, because of this kind of situation.
There is however an example of a workaround in the following example, with the all()
aggregator, followed by a .drop(columns=['flight_id'])
or here where a new flight_id in generated in the all()
aggregator.
For your situation, I am not sure:
full_flight
function? (are the types valid in the first place?).all('split("30min", condition=no_split_enroute)')
would work. It would be probably relevant to implement it if it doesn't.This is what I came up with in the end. Not pretty, but it work well.
def no_split_enroute(f1: Flight, f2: Flight) -> bool:
return f1.data.iloc[-1].altitude < 25000 or f2.data.iloc[0].altitude < 25000
def resample_filter_split(flight: Flight) -> None | Flight:
flight_ = Traffic.from_flights(
[f.resample("2s").filter() for f in flight.split("10min")]
)[0] # 🌶️ I would recommend writing it Flight(pd.concat([...])) to avoid confusion with the [0] here
return flight_.split("30min", condition=no_split_enroute).all(
"{self.flight_id}_{i}"
) # 🌶️ This would be a perfect use case for flight_i.all('split("30min", condition=no_split_en_route)', flight_id="{self.flight_id}_{i}")
def full_flight(flight: Flight) -> None | Flight: # 🌶️ return a bool
if ( # 🌶️ replace 'if' with 'return'
flight.data.altitude.iloc[0] < 5000
and flight.data.altitude.iloc[-1] < 5000
and flight.altitude_max > 15000
):
return flight # 🌶️ remove
return None # 🌶️ remove
t = (
t.iterate_lazy(iterate_kw=dict(by="6 hours"))
.assign_id(name="{self.callsign}_{idx:>04}")
.filter()
.longer_than("1 hour")
.pipe(resample_filter_split)
.pipe(full_flight)
.eval(max_workers=8, desc="processing")
)
Commenting directly in your code above
You can now replace
flight_ = Traffic.from_flights(
[f.resample("2s").filter() for f in flight.split("10min")]
)[0]
with
flight_ = flight.split('10min').map(lambda f: f.resample('2s').filter()).all()
Yes, it works! I think it is good to be merged :+1:
Can we get this method into LazyTraffic?
For example, the following code through error:
Method 'split' not implemented on Traffic or LazyTraffic