bespoke-silicon-group / accelerator-debugger

A framework for building application-level debuggers for bleeding-edge hardware systems.

Speed up loading of VCD file #15

Closed dpetrisko closed 5 years ago

dpetrisko commented 5 years ago

(I haven't looked too closely at the code, so CMIIW.) Could we trim the VCD file to only include the signals that we're interested in? We would iterate through the signal lists of all included models and delete all other signals from the VCD.

taylor-bsg commented 5 years ago

The other approach would be to add a C backend for maintaining the VCD database. This would speed a lot of things up.

Aside: I wonder how much of VCD files are just clocks toggling, and how much could be saved just by having a notion of a periodic signal.

M
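
To illustrate the "periodic signal" idea: if a signal's changes land at evenly spaced timestamps and simply toggle, the whole change list could in principle be replaced by a start time, a toggle interval, and an initial value. A minimal sketch, assuming changes are held as `(time, value)` tuples; `PeriodicSignal`, `detect_periodic`, and `value_at` are hypothetical names for illustration, not part of the debugger.

```python
from typing import List, NamedTuple, Optional, Tuple


class PeriodicSignal(NamedTuple):
    """Compact stand-in for a signal that toggles at a fixed interval."""
    start: int        # timestamp of the first recorded change
    half_period: int  # time between consecutive toggles
    first_value: str  # value taken at `start` ("0" or "1")
    count: int        # number of changes represented


def detect_periodic(changes: List[Tuple[int, str]]) -> Optional[PeriodicSignal]:
    """Return a PeriodicSignal if `changes` is a regular 0/1 toggle, else None."""
    if len(changes) < 3:
        return None
    step = changes[1][0] - changes[0][0]
    if step <= 0:
        return None
    for (t0, v0), (t1, v1) in zip(changes, changes[1:]):
        # Every gap must match, and the value must flip between 0 and 1.
        if t1 - t0 != step or {v0, v1} != {"0", "1"}:
            return None
    return PeriodicSignal(changes[0][0], step, changes[0][1], len(changes))


def value_at(sig: PeriodicSignal, time: int) -> Optional[str]:
    """Reconstruct the signal value at `time` without storing every change."""
    if time < sig.start:
        return None
    # Past the last recorded change, the signal holds its final value.
    toggles = min((time - sig.start) // sig.half_period, sig.count - 1)
    if toggles % 2 == 0:
        return sig.first_value
    return "1" if sig.first_value == "0" else "0"


# A synthetic clock that toggles every 5 time units (20 changes in total).
clock = [(t, str((t // 5) % 2)) for t in range(0, 100, 5)]
compact = detect_periodic(clock)
```

Here 20 stored changes collapse into one four-field record. Whether this pays off on real dumps depends on how much of the file really is clock toggling.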

mara-kr commented 5 years ago

This is what happens now -- we only pass model.signal_names to be parsed into a VCDData structure. That structure is then stored as JSON in $(DATA_FILE).cached, which should considerably speed up loading after the first pass. At some point, this is all just bound by disk-read speed: if a VCD file is on the order of GB, it'll take a decent amount of time regardless of whether the backend is C, PyPy, or pure Python.

I think (from a cursory look) most of VCD files aren't just clocks toggling, but a fair amount of the VCD file is fan{out|in} signals that we don't particularly care about.
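
The parse-once-then-cache flow described above can be sketched roughly as follows. This is an illustration under assumptions, not the debugger's actual code: `load_with_cache` is a hypothetical helper, the parser is passed in as a callable, and the invalidation rule (file modification time plus the requested signal set) is my own guess at a reasonable policy. Only the `.cached` suffix and the JSON format come from the comment.

```python
import json
import os
from typing import Callable, Dict, Iterable, List


# Hypothetical shape for the parsed data: signal name -> list of [time, value].
SignalData = Dict[str, List[list]]


def load_with_cache(
    data_file: str,
    signal_names: Iterable[str],
    parse: Callable[[str, List[str]], SignalData],
) -> SignalData:
    """Load only `signal_names` from `data_file`, reusing a JSON cache if valid."""
    wanted = sorted(set(signal_names))
    cache_file = data_file + ".cached"

    if os.path.exists(cache_file):
        # Reuse the cache only if it is at least as new as the source dump
        # and was built for the same set of signals (assumed policy).
        if os.path.getmtime(cache_file) >= os.path.getmtime(data_file):
            with open(cache_file) as f:
                cached = json.load(f)
            if cached.get("signals") == wanted:
                return cached["data"]

    # Slow path: parse the full dump, keeping only the requested signals.
    data = parse(data_file, wanted)
    with open(cache_file, "w") as f:
        json.dump({"signals": wanted, "data": data}, f)
    return data
```

Recording the signal list next to the data means a model change that adds a signal triggers a re-parse instead of silently returning stale data.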

taylor-bsg commented 5 years ago

Maybe there is a gunzip (or better yet, xz) front end for python that we can use? Then we can store the vcd file as an xz.

m
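
For reference, Python's standard library covers both formats: the `lzma` module reads `.xz` files and the `gzip` module reads `.gz` files, and both can open a file in text mode so a line-oriented parser does not need to change. A minimal sketch; `open_vcd` is a hypothetical helper name, and choosing the codec by file suffix is an assumption.

```python
import gzip
import lzma
from typing import TextIO


def open_vcd(path: str) -> TextIO:
    """Open a VCD dump as text, decompressing on the fly based on its suffix."""
    if path.endswith(".xz"):
        return lzma.open(path, "rt")
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")
```

A parser that iterates over lines could then use `with open_vcd("dump.vcd.xz") as f:` in place of a plain `open` call.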

mara-kr commented 5 years ago

Do you mean storing the entire VCD dump as an xz, or the minimized VCD dump as an xz? For the former: vpd2vcd generates the VCD dump, and a 3.3GB VCD file took 14min to compress, versus about a minute every time the cached file needs to be regenerated, so I don't know if it's worth storing the full VCD dump as an xz. The minimized dump loads quickly enough (around 1s) that compressing it feels like premature optimization; I'd wait for it to be a pain point for users.

taylor-bsg commented 5 years ago

I think the question is how long does xz take to decompress and what is the degree of compression that is attained? The compression is to reduce the pain on our file system. =)

M
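
Both questions (decompression time and degree of compression) can be measured with a few lines of Python. The sketch below uses the standard `lzma` module on a synthetic, highly repetitive VCD-like buffer; `xz_stats` is a hypothetical helper, and the numbers it prints say nothing about real dumps, which should be measured directly.

```python
import lzma
import time


def xz_stats(raw: bytes, preset: int = 6) -> dict:
    """Compress `raw` with xz and report size ratio and decompression time."""
    compressed = lzma.compress(raw, preset=preset)
    start = time.perf_counter()
    restored = lzma.decompress(compressed)
    elapsed = time.perf_counter() - start
    assert restored == raw  # sanity check: the round trip must be lossless
    return {
        "raw_bytes": len(raw),
        "xz_bytes": len(compressed),
        "ratio": len(raw) / len(compressed),
        "decompress_seconds": elapsed,
    }


# Synthetic stand-in for a dump: one signal ("!") toggling every 5 time units.
sample = "".join(f"#{t}\n{t % 2}!\n" for t in range(0, 200_000, 5)).encode()
stats = xz_stats(sample)
print(f"{stats['ratio']:.1f}x smaller, "
      f"{stats['decompress_seconds']:.3f}s to decompress")
```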

mara-kr commented 5 years ago

So, on a single test of a 3.3GB file (the same one for the 14min test), it took 1:57 to load the compressed file and 1:25 to load the uncompressed file; the compressed file was 159MB. In hindsight, we're basically building our own VPD file that's open source -- I'll push the support for loading XZ files in.
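
Putting the figures from that single test side by side makes the trade-off easier to see. The short calculation below uses only the numbers quoted above and assumes decimal units (1 GB = 1000 MB), which the comment does not specify.

```python
def mmss_to_seconds(text: str) -> int:
    """Convert an 'M:SS' duration string to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


raw_mb = 3300  # 3.3GB, assuming 1 GB = 1000 MB
xz_mb = 159    # size of the compressed file

compressed_load = mmss_to_seconds("1:57")
uncompressed_load = mmss_to_seconds("1:25")

ratio = raw_mb / xz_mb
extra_seconds = compressed_load - uncompressed_load
slowdown = compressed_load / uncompressed_load - 1

print(f"size ratio ~{ratio:.1f}x, "
      f"load time +{extra_seconds}s ({slowdown:.0%} slower)")
```

On this one data point, the file shrinks by roughly a factor of 21 while loading takes about half a minute longer.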

dpetrisko commented 5 years ago

Could also look into FST?

https://www.systutorials.com/docs/linux/man/1-vcd2fst/

mara-kr commented 5 years ago

FST's an option. Part of why VCD files are nice is that there's enough momentum that a parser has been written in most languages. gtkwave's FST parser is 2k lines of C; I'm sure it'd be good to have a port at some point, but that's definitely a fair bit of work.

mara-kr commented 5 years ago

f6d2ce8 has XZ support; see #18 for FST support.