Error when reading new VBA

CMU-SAFARI / Sibyl

Source code for the software implementation of Sibyl proposed in our ISCA 2022 paper: Gagandeep Singh et al., "Sibyl: Adaptive and Extensible Data Placement in Hybrid Storage Systems using Online Reinforcement Learning" at https://people.inf.ethz.ch/omutlu/pub/Sibyl_RL-based-data-placement-in-hybrid-storage-systems_isca22.pdf

MIT License

I am trying to run Sibyl with msr-cambridge1-sample.csv (http://iotta.snia.org/traces/block-io/388). I replaced the storage drivers with dummy code that calls time.sleep() to simulate latency.

There is an error in the read() function in hybridstorage.py when a new VBA is read (line 110 of msr-cambridge1-sample.csv). read() only handles the case where the VBA is already in self._mapping_table.index, which means the VBA must have been written through Sibyl earlier.

https://github.com/CMU-SAFARI/Sibyl/blob/ab7199f0b7f75710ca6b56870e0f5bd4a33f17eb/sibyl/hybridstorage.py#L274

If the VBA is not in self._mapping_table.index, the latency defaults to 0. This causes a zero division error when computing the reward because self._current_perf is 0.

https://github.com/CMU-SAFARI/Sibyl/blob/ab7199f0b7f75710ca6b56870e0f5bd4a33f17eb/sibyl/hybridstorageenvironment.py#L184
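To make the failure mode concrete, here is a minimal sketch of a reward that divides by request latency. It assumes the reward is simply the inverse of the latency, which is a guess based on the division by self._current_perf described above. The function name inverse_latency_reward is hypothetical and is not part of Sibyl. Raising on a non-positive latency points at the request that was never served instead of failing later with ZeroDivisionError.

```python
def inverse_latency_reward(latency_s: float) -> float:
    """Hypothetical reward: larger for faster requests (not Sibyl's code).

    A latency of 0 means the request was never actually served, so it is
    reported explicitly instead of being divided by.
    """
    if latency_s <= 0.0:
        raise ValueError(f"request reported non-positive latency: {latency_s!r}")
    return 1.0 / latency_s
```

With this guard, a read of an unmapped VBA would fail with a message naming the bad latency value, which makes it easier to trace the problem back to read().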

Please advise how read requests for a new VBA should be handled. The latency should not default to 0. Thank you.
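One possible way to handle such reads, sketched under stated assumptions and not as the maintainers' answer: treat a VBA that was never written as data that already lived on the slow device before the trace started, record it in the mapping table, and serve it from there so that a real latency is measured. Everything in this sketch is hypothetical: the class ToyHybridStorage, the FAST and SLOW device ids, the dummy latencies, and the plain dict that stands in for the mapping table.

```python
import time

FAST, SLOW = 0, 1                                   # hypothetical device ids
DEVICE_LATENCY_S = {FAST: 0.0001, SLOW: 0.001}      # dummy latencies in seconds


class ToyHybridStorage:
    """Toy stand-in for a hybrid storage layer (not Sibyl's implementation)."""

    def __init__(self):
        self._mapping_table = {}                    # VBA -> device id

    def write(self, vba, device):
        self._mapping_table[vba] = device
        return self._serve(device)

    def read(self, vba):
        if vba not in self._mapping_table:
            # Assumption: data that predates the trace sits on the slow device.
            self._mapping_table[vba] = SLOW
        return self._serve(self._mapping_table[vba])

    def _serve(self, device):
        # Simulate the device with time.sleep() and return the measured latency.
        start = time.perf_counter()
        time.sleep(DEVICE_LATENCY_S[device])
        return time.perf_counter() - start
```

With this fallback, read() returns a positive latency for every request, so a reward that divides by the latency never sees 0. Whether first-touch reads should be charged to the slow device is a modeling decision; the alternative is to remove such reads from the trace.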

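import glob

import pandas as pd
from tqdm import tqdm

for file in tqdm(glob.glob("MSR-Cambridge/*.csv")):
    # Note: read_csv treats the first line as a header; pass header=None
    # if the trace file has no header row.
    df = pd.read_csv(file)
    df.columns = range(7)  # column 3 = request type, 4 = offset, 5 = size

    # Request types seen at each offset, in order of first appearance.
    grouped_vals = df.groupby(4)[3].unique()

    # Offsets that are never written, or whose first access is a read.
    filtered_vals = [index for index, values in grouped_vals.items()
                     if 'Write' not in values or values[0] == 'Read']

    # Drop those offsets, then keep offset, size and type (in that order).
    df = df[~df[4].isin(filtered_vals)]
    df = df.iloc[:, 3:6][[4, 5, 3]]
    df.to_csv(f"fixed_msr/{file.split('/')[1]}", header=False, index=False)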
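The rule applied by the script above can be shown on a few in-memory requests. The sketch below is a dependency-free restatement, assuming the only request types are "Read" and "Write" and that each row is an (offset, size, type) tuple. The function name keep_offsets_first_written is hypothetical.

```python
def keep_offsets_first_written(rows):
    """Keep only requests to offsets whose first access in the trace is a write."""
    first_type = {}
    for offset, _size, req_type in rows:
        # setdefault records the type of the first request seen for each offset.
        first_type.setdefault(offset, req_type)
    return [row for row in rows if first_type[row[0]] == "Write"]


trace = [
    (0, 512, "Read"),    # offset 0 is read before it is written: dropped
    (8, 512, "Write"),   # offset 8 is written first: kept
    (8, 512, "Read"),
    (0, 512, "Write"),   # still dropped, because offset 0 was first read
    (16, 512, "Read"),   # offset 16 is never written: dropped
]
filtered = keep_offsets_first_written(trace)
```

After filtering, every read in the trace targets an offset that an earlier write has already placed in the mapping table, so the case of a read to a new VBA does not come up.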

Error when reading new VBA #6