microsoft / vasim

Enhanced autoscaling with VASIM: Vertical Autoscaling Simulator Toolkit
MIT License

Reading in CSVs with glob #18

Open ksaur opened 2 months ago

ksaur commented 2 months ago

We currently read in all of the performance trace CSVs using Python's glob (list(self.data_dir.glob("**/*.csv"))) with no security checks. It seems like we need to do more here to sanitize inputs.

ksaur commented 1 month ago

We need to raise the priority of this. We should only read in files named like the perf_event_log; otherwise, other CSVs in users' folders will cause the ingestor to fail. We need to fix that and add appropriate error messages.
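
A rough sketch of what the filtering could look like, not the actual VASIM code. The function name find_trace_csvs and the pattern "*perf_event_log*.csv" are assumptions for illustration; the real naming convention may differ. It keeps the recursive glob, but only accepts files whose names match the pattern and that resolve to a location inside the data directory, and it raises a readable error when nothing matches.

```python
import fnmatch
from pathlib import Path

# Hypothetical naming convention; adjust to the real trace file names.
TRACE_PATTERN = "*perf_event_log*.csv"


def find_trace_csvs(data_dir, pattern=TRACE_PATTERN):
    """Return (matched, skipped) lists of CSV paths found under data_dir."""
    root = Path(data_dir).resolve()
    if not root.is_dir():
        raise NotADirectoryError(f"Data directory does not exist: {root}")

    matched, skipped = [], []
    for path in sorted(root.glob("**/*.csv")):
        resolved = path.resolve()
        # Reject directories named *.csv and symlinks pointing outside root.
        inside_root = root in resolved.parents
        if resolved.is_file() and inside_root and fnmatch.fnmatch(path.name, pattern):
            matched.append(path)
        else:
            skipped.append(path)

    if not matched:
        raise FileNotFoundError(
            f"No files matching '{pattern}' found under {root}. "
            f"Skipped {len(skipped)} other CSV file(s)."
        )
    return matched, skipped
```

Returning the skipped files as well would let the caller log a warning that names each ignored CSV, instead of failing later with an unrelated error.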

ksaur commented 1 month ago

Users will currently get KeyError: "Cannot get left slice bound for non-unique label: Timestamp('2023-04-02 00:09:00')" or a similar error if their folders contain unexpected CSVs.
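
One possible way to turn that into a clearer message is to check for repeated timestamps before the data is combined. This is only a sketch under assumptions: the helper name check_unique_timestamps and the column name "timestamp" are made up for illustration, and it uses the standard csv module rather than whatever the ingestor actually uses.

```python
import csv


def check_unique_timestamps(csv_paths, column="timestamp"):
    """Raise ValueError naming the files if a timestamp value repeats.

    The column name "timestamp" is an assumption, not the confirmed schema.
    """
    first_seen = {}  # timestamp string -> path where it first appeared
    for path in csv_paths:
        with open(path, newline="") as handle:
            reader = csv.DictReader(handle)
            if reader.fieldnames is None or column not in reader.fieldnames:
                raise ValueError(f"{path}: missing expected column '{column}'")
            for row in reader:
                stamp = row[column]
                if stamp in first_seen:
                    raise ValueError(
                        f"Duplicate timestamp {stamp!r} in {path} "
                        f"(first seen in {first_seen[stamp]}). "
                        "Is there an unexpected CSV in the data directory?"
                    )
                first_seen[stamp] = path
```

The error names both files involved, which should make a stray CSV in the folder much easier to spot than the slice-bound KeyError.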