1a1a11a / libCacheSim

a high performance library for building cache simulators
GNU General Public License v3.0

Unable to run csv traces that have alphabetical object IDs #32

Closed midsterx closed 11 months ago

midsterx commented 1 year ago

Hi,

I have traces of the following format (with alphabetical object IDs):

# timestamp (ms), op type [Put:0, Get:1, Delete:2, Head: 3, Copy: 4], object id, object size (byte)
0,1,A,8371
103,1,B,31
159,1,C,3668
315,1,D,15572
418,1,E,62099
529,1,F,7621
881,1,G,1316252
956,1,H,104330
1026,1,I,4778

And I tried to run this command:

python3 plot_mrc_size.py --tracepath ../../../shared/IBMObjectStorageTrace/IBMObjectStoreTrace018Part0-sliced-compressed --trace-format csv --trace-format-params="time-col=1,obj-id-col=3,obj-size-col=4,delimiter=,,obj-id-is-num=0" --algos=fifo,lru,clock,slru,lfu,lfuda,arc,twoq,gdsf,hyperbolic,lecar,cacheus,lhd,qdlp,s3fifo,sieve --sizes=0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9

However, I am faced with the following error:

Traceback (most recent call last):
  File "plot_mrc_size.py", line 229, in <module>
    plot_mrc_size(mrc_dict,
  File "plot_mrc_size.py", line 116, in plot_mrc_size
    first_size = int(list(mrc_dict.values())[0][0][0])
IndexError: list index out of range

Could you give me some pointers as to how I can fix this issue and run my trace?
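As a quick sanity check on a trace in the format quoted above, a minimal Python sketch can confirm that the file parses with the same 1-indexed columns passed in `--trace-format-params`. The helper name `summarize_trace` and the inline sample are illustrative assumptions; this is not part of libCacheSim.

```python
import csv
import io


def summarize_trace(lines, time_col=1, obj_id_col=3, obj_size_col=4, delimiter=","):
    """Summarize a CSV trace (hypothetical helper, not a libCacheSim API).

    Column numbers are 1-indexed, mirroring time-col/obj-id-col/obj-size-col.
    """
    n_requests = 0
    sizes = {}  # object id -> most recently seen size in bytes
    for row in csv.reader(lines, delimiter=delimiter):
        if not row or row[0].lstrip().startswith("#"):
            continue  # skip blank lines and the commented header line
        int(row[time_col - 1])  # raises ValueError on a malformed timestamp
        obj_id = row[obj_id_col - 1].strip()  # kept as a string, so "A" is fine
        sizes[obj_id] = int(row[obj_size_col - 1])
        n_requests += 1
    return {
        "requests": n_requests,
        "objects": len(sizes),
        "bytes": sum(sizes.values()),
    }


# First three requests from the trace excerpt above.
sample = """# timestamp (ms), op type, object id, object size (byte)
0,1,A,8371
103,1,B,31
159,1,C,3668
"""
summary = summarize_trace(io.StringIO(sample))
print(summary)
```

If this parses cleanly, the trace itself is probably not what triggers the `IndexError`.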

1a1a11a commented 1 year ago

Hi @midsterx

  1. Non-numeric object IDs are supported.
  2. The problem is caused by a bug in lfuda; if you remove this algorithm, it should work. I have updated the logging to make such errors easier to interpret, but I won't be able to fix the bug soon. If you are interested in lfuda, a bug fix is welcome.

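The `IndexError` in the traceback comes from indexing `mrc_dict` without checking that any results were parsed. A minimal sketch of a guard is shown here, assuming `mrc_dict` maps each algorithm name to a list of `(cache_size, miss_ratio)` pairs; that structure is inferred from the traceback line, and `first_cache_size` is a hypothetical helper, not the actual fix pushed to the repository.

```python
def first_cache_size(mrc_dict):
    """Return the first cache size, failing with a readable message.

    Assumes mrc_dict looks like {"lru": [(size, miss_ratio), ...], ...};
    this shape is an assumption based on the traceback.
    """
    if not mrc_dict:
        raise ValueError("no miss ratio results were produced; check the cachesim log")
    # Collect algorithms whose result list is empty, e.g. because the run crashed.
    empty = [algo for algo, points in mrc_dict.items() if not points]
    if empty:
        raise ValueError("no results for: " + ", ".join(empty))
    first_points = next(iter(mrc_dict.values()))
    return int(first_points[0][0])


print(first_cache_size({"fifo": [(100, 0.5), (200, 0.4)]}))
```

With a check like this, an empty result names the affected algorithms instead of surfacing as `list index out of range`.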
midsterx commented 1 year ago

Hey @1a1a11a, I removed lfuda for now and ran the below command:

python3 plot_mrc_size.py --tracepath ../../../shared/IBMObjectStorageTrace/IBMObjectStoreTrace018Part0-sliced-compressed --trace-format csv --trace-format-params="time-col=1,obj-id-col=3,obj-size-col=4,delimiter=,,obj-id-is-num=0" --algos=fifo,lru,clock,slru,lfu,arc,twoq,gdsf,hyperbolic,lecar,cacheus,lhd,qdlp,s3fifo,sieve --sizes=0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9

However, I am faced with the same error:

Traceback (most recent call last):
  File "plot_mrc_size.py", line 229, in <module>
    plot_mrc_size(mrc_dict,
  File "plot_mrc_size.py", line 116, in plot_mrc_size
    first_size = int(list(mrc_dict.values())[0][0][0])
IndexError: list index out of range

Are there other algorithms that result in the bug?

1a1a11a commented 1 year ago

Hmm, can you try the other algorithms one at a time and see which of them cause the problem?
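One way to narrow this down is to run the script once per algorithm and record which runs fail. The sketch below reuses only the flags that appear in the commands quoted in this thread; the helper names (`build_command`, `run_one`, `failing_algos`) are hypothetical, and it assumes a failed run exits with a non-zero status, as an uncaught Python exception does.

```python
import subprocess

# Algorithm list taken from the command above (lfuda already removed).
ALGOS = ("fifo,lru,clock,slru,lfu,arc,twoq,gdsf,hyperbolic,"
         "lecar,cacheus,lhd,qdlp,s3fifo,sieve").split(",")


def build_command(algo, tracepath, params, sizes):
    """Build the plot_mrc_size.py command line for a single algorithm."""
    return [
        "python3", "plot_mrc_size.py",
        "--tracepath", tracepath,
        "--trace-format", "csv",
        f"--trace-format-params={params}",
        f"--algos={algo}",
        f"--sizes={sizes}",
    ]


def run_one(algo, tracepath, params, sizes):
    """Return True if the run for this algorithm exits successfully."""
    result = subprocess.run(build_command(algo, tracepath, params, sizes),
                            capture_output=True, text=True)
    return result.returncode == 0


def failing_algos(algos, run):
    """Return the algorithms for which run(algo) reports failure."""
    return [algo for algo in algos if not run(algo)]


# Example usage (not executed here; substitute your own trace path):
# params = "time-col=1,obj-id-col=3,obj-size-col=4,delimiter=,,obj-id-is-num=0"
# bad = failing_algos(ALGOS, lambda a: run_one(a, "<tracepath>", params, "0.1,0.5,0.9"))
# print(bad)
```

Passing the runner as a callable keeps the search logic separate from how each run is launched.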

1a1a11a commented 1 year ago

Oh, I just realized that I forgot to push the commits. Can you try again?

midsterx commented 12 months ago

Hey @1a1a11a, I just pulled the latest changes, and the code seems to work (even for LFUDA). Thank you for letting me know!

midsterx commented 12 months ago

However, the code gives the following error when I try to run it on a larger trace:

python3 plot_mrc_size.py --tracepath ../../../shared/IBMObjectStorageTrace/IBMObjectStoreTrace018Part0-sliced-compressed --trace-format csv --trace-format-params="time-col=1,obj-id-col=3,obj-size-col=4,delimiter=,,obj-id-is-num=0" --algos=fifo,lru,clock,slru,lfu,lfuda,arc,twoq,gdsf,hyperbolic,lecar,cacheus,lhd,qdlp,s3fifo,sieve --sizes=0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9
18:36:43: INFO [setup_utils.py:57 (setup_utils)]:   BASEPATH: <...>
18:36:43: INFO [setup_utils.py:58 (setup_utils)]:   CACHESIM_PATH: <...>
18:38:57: WARNING [plot_mrc_size.py:69 (plot_mrc_size)]:    cachesim may have crashed with segfault
18:38:57: INFO [plot_mrc_size.py:77 (plot_mrc_size)]:   cachesim log: [INFO]  10-21-2023 18:36:43    csv.c:150  (tid=139990001845632): detect csv trace has header 1
18:38:57: INFO [plot_mrc_size.py:77 (plot_mrc_size)]:   cachesim log: [INFO]  10-21-2023 18:36:43 cli_reader_utils.c:262  (tid=139990001845632): calculating working set size...
18:38:57: INFO [plot_mrc_size.py:77 (plot_mrc_size)]:   cachesim log: [INFO]  10-21-2023 18:37:39 cli_reader_utils.c:279  (tid=139990001845632): working set size: 62653404 object 4570314573187 byte
18:38:57: INFO [plot_mrc_size.py:77 (plot_mrc_size)]:   cachesim log: [INFO]  10-21-2023 18:37:56 cli_parser.c:545  (tid=139990001845632): trace path: ../../../shared/IBMObjectStorageTrace/IBMObjectStoreTrace018Part0-sliced-compressed, trace_type CSV_TRACE, ofilepath IBMObjectStoreTrace018Part0-sliced-compressed.cachesim, 32 threads, warmup -1 sec, total 16 algo x 9 size = 144 caches, fifo, lru, clock, slru, lfu, lfuda, arc, twoq, gdsf, hyperbolic, lecar, cacheus, lhd, qdlp, s3fifo, sieve, trace-type-params: time-col=1,obj-id-col=3,obj-size-col=4,delimiter=,,obj-id-is-num=0
18:38:57: INFO [plot_mrc_size.py:77 (plot_mrc_size)]:   cachesim log: [INFO]  10-21-2023 18:37:57 simulator.c:294  (tid=139990001845632): simulate_with_multi_caches starts computation, num_warmup_req 0, start cache FIFO size 426GiB, end cache Sieve size 4TiB, 144 caches, 32 threads, please wait
18:38:57: INFO [plot_mrc_size.py:77 (plot_mrc_size)]:   cachesim log:
18:38:57: ERROR [plot_mrc_size.py:259 (plot_mrc_size)]:     fail to compute mrc

Is there any restriction on the trace file size @1a1a11a?

1a1a11a commented 12 months ago

No, there is no limit on the trace file size. However, you are simulating 32 caches at the same time, so you need to make sure you have enough DRAM (some algorithms require a lot of DRAM). I would suggest reducing the number of threads.
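The numbers in the log above can be reproduced with a little arithmetic, which also gives a feel for why concurrency matters. The working set size, object count, thread count, and size fractions below are copied from the log and the command; `bytes_per_object` is a purely hypothetical per-object metadata cost, since the real cost depends on the algorithm and is not stated in this thread.

```python
WORKING_SET_BYTES = 4_570_314_573_187   # "working set size" from the log
WORKING_SET_OBJECTS = 62_653_404        # object count from the log
SIZES = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
N_ALGOS = 16
N_THREADS = 32

# Simulated cache sizes are fractions of the working set.
cache_sizes = [int(WORKING_SET_BYTES * frac) for frac in SIZES]
smallest_gib = cache_sizes[0] / 2**30    # about 425.6, logged as 426GiB
largest_tib = cache_sizes[-1] / 2**40    # about 3.74, logged as 4TiB
n_caches = N_ALGOS * len(SIZES)          # 144 caches in total


def metadata_upper_bound(n_concurrent, n_objects, bytes_per_object):
    """Loose upper bound on simulator memory, in bytes.

    Assumes every concurrently simulated cache tracks every object at
    bytes_per_object each. bytes_per_object is a hypothetical input,
    not a measured libCacheSim figure.
    """
    return n_concurrent * n_objects * bytes_per_object


for threads in (32, 8):
    bound = metadata_upper_bound(threads, WORKING_SET_OBJECTS, 100)
    print(f"{threads} threads: at most {bound / 2**30:.1f} GiB of metadata")
```

Under these assumptions the bound scales linearly with the number of caches simulated at once, which is why lowering the thread count reduces peak memory.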

midsterx commented 11 months ago

Understood, I'll try that. Thank you!