d451gon closed this issue 1 year ago
Hi @d451gon,
The main changes are fixes for path handling in the caching code.
A more advanced change concerns multi-node caching: we added a scenario-balancing scheme. Instead of distributing work by logs, we distribute it by scenarios. This happens in two steps. First, every node builds the scenarios from the logs distributed to it. The generated scenario tokens and the logs they belong to are then saved to a metadata file. Second, the scenario list is chunked and redistributed across the nodes for feature computation. This results in a much more even distribution of computation. We are still testing this out, so the caveat is that you will have to change this line of code to DistributedMode.SCENARIO_BASED to enable it.
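To illustrate why distributing by scenarios balances better than distributing by logs, here is a minimal, self-contained sketch of the two steps. It is not the devkit's implementation: the helper names (`chunk_evenly`, `build_scenarios`, `redistribute`) and the toy data are hypothetical, and the metadata file is simulated with in-memory lists.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: a scenario is identified by (log_name, scenario_token).
Scenario = Tuple[str, str]


def chunk_evenly(items: List, num_chunks: int) -> List[List]:
    """Split items into num_chunks contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), num_chunks)
    chunks, start = [], 0
    for i in range(num_chunks):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def build_scenarios(node_logs: List[str], scenarios_per_log: Dict[str, List[str]]) -> List[Scenario]:
    """Step 1 (per node): build scenarios from the logs assigned to this node."""
    return [(log, token) for log in node_logs for token in scenarios_per_log[log]]


def redistribute(metadata_per_node: List[List[Scenario]], num_nodes: int) -> List[List[Scenario]]:
    """Step 2: merge every node's metadata, then chunk the full scenario list across nodes."""
    all_scenarios = sorted(s for node in metadata_per_node for s in node)
    return chunk_evenly(all_scenarios, num_nodes)


# Toy data (assumed): log "a" contains far more scenarios than the other logs.
scenarios_per_log = {"a": [f"a{i}" for i in range(9)], "b": ["b0"], "c": ["c0"], "d": ["d0"]}
logs_by_node = chunk_evenly(sorted(scenarios_per_log), 2)  # log-based split: [a, b] and [c, d]

# In-memory stand-in for the metadata each node would save to a file.
metadata = [build_scenarios(logs, scenarios_per_log) for logs in logs_by_node]
log_based_load = [len(m) for m in metadata]
scenario_based_load = [len(c) for c in redistribute(metadata, 2)]
print(log_based_load, scenario_based_load)
```

With this toy data, the log-based split leaves one node with most of the scenarios, while the second step evens out the per-node scenario counts before feature computation.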
We released about 1TB of sensor data. There's more to come :)
@patk-motional Thanks for the 1TB of sensor data. How much disk space will be needed for the full 10% of the sensor data? I would like to buy disks in advance.
Hi @yenw,
We will release a total of roughly 15TB of sensor data. The sensor data covers 10% of the 1300h of logged data. Hence, you will have approximately 130h of sensor data.
Hi,
Could you please provide some brief insights on what has been changed and improved regarding feature caching?
Thanks!
P.S. Excellent news that you managed to release a portion of the sensor data!