1a1a11a / libCacheSim

a high performance library for building cache simulators
GNU General Public License v3.0

Trace analyzer space usage #29

Closed fedorova closed 2 months ago

fedorova commented 11 months ago

Hi folks, I am using the trace analyzer to compute reuse distance on a 371MB trace. The analyzer keeps running but can never finish because it runs out of disk space. For example, the last time I tried, my .reuseWindow_w300_rt file grew to 91GB, filling up the remaining space in my file system. At that point the trace analyzer hung, neither quitting nor producing any info messages.

Is it normal to generate 91GB+ of .reuseWindow_w300_rt for a 371MB trace? Is there a way to run the trace analyzer differently so that it actually completes?

1a1a11a commented 11 months ago

No, that is not expected. Can you show the command you used?

1a1a11a commented 11 months ago

A few other comments:

  1. The traceAnalyzer was merged from my other work recently and did not have any tests, so it may have bugs, sorry...
  2. If you just need reuse distance, you can disable the window-based calculation by using --common instead of --reuse.
  3. The reuse distance calculated by traceAnalyzer is not "stack distance"; it is the number of requests (virtual time) or seconds (real time) between two accesses of the same object.
  4. If you need stack distance, we have a tool called distUtil; you can try ./bin/distUtil ../data/trace.vscsi vscsi stack_dist txt trace. More usage can be found with --help.
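To make the distinction in point 3 concrete, here is a minimal sketch (plain Python, not libCacheSim code) contrasting the two metrics on a toy trace: reuse distance counts all requests between two accesses of an object, while stack distance counts only distinct objects.

```python
def reuse_distances(trace):
    """Virtual reuse time: number of requests since the previous
    access to the same object (-1 for a first access)."""
    last_seen = {}
    out = []
    for i, obj in enumerate(trace):
        out.append(i - last_seen[obj] if obj in last_seen else -1)
        last_seen[obj] = i
    return out

def stack_distances(trace):
    """Stack distance: number of *distinct* objects accessed since
    the previous access to the same object (-1 for a first access)."""
    last_seen = {}
    out = []
    for i, obj in enumerate(trace):
        if obj in last_seen:
            out.append(len(set(trace[last_seen[obj] + 1:i])))
        else:
            out.append(-1)
        last_seen[obj] = i
    return out

trace = ["A", "B", "B", "A"]
print(reuse_distances(trace))  # [-1, -1, 1, 3]
print(stack_distances(trace))  # [-1, -1, 0, 1]
```

Note how the second access to A has a reuse distance of 3 (three requests in between) but a stack distance of only 1 (only B was touched in between).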

I hope this helps.

fedorova commented 11 months ago

I ran the following command:

traceAnalyzer <trace_path> csv -t "time-col=1, obj-id-col=2, obj-size-col=3" --all

But if I used --reuse instead of --all, I saw the same pattern.

The command completed on a 308MB trace after generating a 132GB <>.csv.reuseWindow_w300_rt file, but the <>.csv.reuse file has zero bytes in it, as do .csv.accessRtime, .csv.accessVtime, .csv.popularity, and .csv.size.

1a1a11a commented 11 months ago

I am not able to reproduce this problem. Do you mind sharing a few lines of the input file? A few other suggestions:

  1. The csv reader is not robust and may run into problems if the trace is not well formatted (e.g., a missing delimiter), so print the trace with bin/tracePrint to check whether the csv trace is parsed correctly.
  2. I was wrong about --reuse; it does generate the reuseWindow file, so use --common or take a look at the distUtil tool.
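As a quick pre-check independent of libCacheSim, a few lines of Python can flag malformed rows before running the analyzer (the column count and delimiter below are assumptions matching the -t options shown above):

```python
import csv

def check_trace(path, n_cols=3, delimiter=","):
    """Return (line number, row) pairs for every row that does not
    have exactly n_cols numeric fields, e.g. due to a missing
    delimiter or a non-numeric value."""
    bad = []
    with open(path) as f:
        for lineno, row in enumerate(csv.reader(f, delimiter=delimiter), 1):
            if len(row) != n_cols or not all(
                c.strip().lstrip("-").isdigit() for c in row
            ):
                bad.append((lineno, row))
    return bad
```

An empty result means every row parses as time,obj-id,obj-size; anything returned points at the exact offending lines.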

:)

fedorova commented 11 months ago

> I am not able to reproduce this problem, do you mind sharing a few lines of the input file?

Here are a few lines of my csv file:

1695337878208738,4096,4096
1695337878208830,4096,4096
1695337878208853,4096,4096
1695337878208927,4096,4096
1695337878208942,4096,4096
1695337878208990,4096,4096
1695337878209258,4096,4096
1695337878209452,4096,4096
1695337878209471,4096,4096
1695337878209482,4096,4096
1695337878209580,4096,4096
1695337878209650,4096,4096
1695337878209774,4096,4096
1695337878209851,4096,4096
1695337878209866,4096,4096
1695337878209882,4096,4096
1695337878209928,4096,4096
1695337878209939,4096,4096
1695337878209949,4096,4096
1695337878209959,4096,4096
1695337878209970,4096,4096
1695337878210009,4096,4096
1695337878222642,4096,4096
1695337878275331,1237467136,4096
1695337878275356,1237467136,4096
1695337878275384,1236574208,4096

fedorova commented 11 months ago

Here is the entire gzipped trace: https://people.ece.ubc.ca/~sasha/TMP/evict-btree.csv.gz

1a1a11a commented 10 months ago

Thank you for sharing the trace!

  1. Large output: the large output is caused by the long time range. The default time unit is seconds and the default time window is 300 sec. The trace spans 120592826 seconds (I guess the time unit is not seconds?), so the number of windows is very large (120592826 / 300 is roughly 400,000 windows), which produces the huge output file. I would suggest converting the timestamps to seconds, changing time_window to a larger value, or simply skipping this computation.

  2. Incorrect results: using a debug build (CMAKE_BUILD_TYPE=Debug), the binary crashes at https://github.com/1a1a11a/libCacheSim/blob/82e76bd19e98627e6babfc0cced17b4ae32e824e/libCacheSim/traceAnalyzer/analyzer.cpp#L117, which suggests that the trace is not time-ordered; for example, the following lines are out of order.

1695337878371012,1236004864,4096
1695337878371038,1236008960,4096
1695337878371030,671416320,28672
1695337878371045,112283648,28672
1695337878371071,112398336,28672

After sorting the data, the analysis can finish without any issue.
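For reference, a csv trace like this can be time-ordered with `sort -t, -k1,1n`, or with a small Python sketch like the one below (the file paths are placeholders, and it assumes the whole trace fits in memory):

```python
import csv

def sort_trace(in_path, out_path):
    """Sort a time,obj-id,size csv trace by the numeric timestamp
    column so the analyzer's time-ordering assumption holds."""
    with open(in_path) as f:
        rows = list(csv.reader(f))
    rows.sort(key=lambda r: int(r[0]))
    with open(out_path, "w", newline="") as f:
        csv.writer(f, lineterminator="\n").writerows(rows)
```

For traces too large for memory, the external `sort` command above is the safer route.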

One minor suggestion: when using csv traces, if the object id (e.g., the block address) is numeric, adding obj-id-is-num=1 to the trace-type options will reduce memory usage and runtime, e.g., traceAnalyzer <trace_path> csv -t "time-col=1, obj-id-col=2, obj-size-col=3, obj-id-is-num=1" --common

I hope this helps. Thank you for reporting the issue!

1a1a11a commented 2 months ago

I will close the issue for now, but feel free to reopen it if you believe there is anything I can do to help.