muelleram opened 9 months ago
Full temporal flexibility is possible in the way the interpolation between timestamps is implemented, using timedelta in seconds. However, it is currently not possible to go to a finer resolution than months, because the new ids still need to be unique: if we extend an existing id, e.g. 245, with the timestamp of the temporal copy, e.g. 20240123 for 23.01.2024, we get a new id of 24520240123, which is too large to be stored as a 32-bit C long, see error. We need to append the same time resolution as our temporal grouping so that the ids remain unique.
Maybe we need to change to an int64 id. Or do you guys know of a better workaround?
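For reference, a quick check of the overflow, using the numbers from the example above:

```python
import numpy as np

# appending the date 20240123 to id 245 as decimal digits
new_id = int(f"{245}{20240123}")        # -> 24520240123

print(np.iinfo(np.int32).max)           # 2147483647
print(new_id > np.iinfo(np.int32).max)  # True -> overflows a 32-bit C long
```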
Another issue with the current implementation is that the aggregated process copies get assigned to the first timestamp within the aggregated time period in the timeline_df. E.g. for temporal resolution = year: P1 at 2024-12-31 and P2 at 2024-08-31 get grouped into one 2024 row, and this row is reassigned to date = 2024-01-01. Correct would be some kind of (weighted) temporal average for the date that takes the timing of P1 and P2 into account, but I wasn't able to achieve this with groupby and datetime/timedelta (one possible approach is sketched below).
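One way that seems to work is averaging the int64 nanosecond representation of the datetimes rather than the datetimes themselves. A minimal sketch, assuming the exact dates live in a `date` column and the weights in an `amount` column (both column names are assumptions, not the actual timeline_df layout):

```python
import pandas as pd

# hypothetical timeline_df with process amounts and exact dates
timeline_df = pd.DataFrame({
    "process": ["P1", "P2"],
    "amount": [1.0, 3.0],
    "date": pd.to_datetime(["2024-12-31", "2024-08-31"]),
})

def weighted_mean_date(group: pd.DataFrame) -> pd.Timestamp:
    # datetimes can't be averaged directly, but their int64
    # nanoseconds-since-epoch representation can
    ns = group["date"].astype("int64")
    mean_ns = (ns * group["amount"]).sum() / group["amount"].sum()
    return pd.Timestamp(int(mean_ns))

# amount-weighted average date per year instead of hardcoded Jan 1
aggregated_dates = timeline_df.groupby(timeline_df["date"].dt.year).apply(
    weighted_mean_date
)
```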
See the code in medusa_tools, l. 115 ff.
Ok, I checked with Chris and currently this is indeed hardcoded to be an int32. It might be possible to change this, but I suppose we could do a quick mapping as well, which might be faster.
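For what it's worth, a minimal sketch of what such a mapping could look like, assuming the new ids only need to be unique within one run (the function and variable names here are made up for illustration, not existing medusa_tools API):

```python
import itertools

# hand out small sequential ids instead of concatenating digits
_new_ids = itertools.count(start=1)
_id_map: dict[tuple[int, str], int] = {}

def temporal_id(process_id: int, timestamp: str) -> int:
    """Return a compact, unique id for a (process, timestamp) copy."""
    key = (process_id, timestamp)
    if key not in _id_map:
        _id_map[key] = next(_new_ids)
    return _id_map[key]

# e.g. temporal_id(245, "2024-01-23") stays well within int32 range
```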
What do we want to be able to save: year-month-day-hour? I guess the season as well, as discussed before, but that is already contained in the month.
Currently, t(link) (the temporal aggregation of new process copies) is set to 1 year. It would be great to let users define the level of aggregation themselves, possibly within a reasonable temporal range (see the sketch below). E.g. seconds/minutes might not be suitable, as this would lead to a large quantity of new processes in the database and does not fit common LCA problems.
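A possible shape for that setting, restricted to the resolutions discussed above; `group_timeline` and the `date` column are assumptions for illustration, not the actual medusa_tools API:

```python
import pandas as pd

# map user-facing resolution names to pandas frequency aliases
ALLOWED_RESOLUTIONS = {"year": "YS", "month": "MS", "day": "D", "hour": "h"}

def group_timeline(timeline_df: pd.DataFrame, resolution: str = "year"):
    """Group process copies by a user-chosen temporal resolution."""
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(
            f"resolution must be one of {sorted(ALLOWED_RESOLUTIONS)}; "
            "finer levels like minutes/seconds would flood the database"
        )
    return timeline_df.groupby(
        pd.Grouper(key="date", freq=ALLOWED_RESOLUTIONS[resolution])
    )
```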