jacereda / fsatrace

Filesystem access tracer
ISC License

Remove duplicate adjacent lines #3

Closed ndmitchell closed 8 years ago

ndmitchell commented 8 years ago

Currently, running gcc -c main.c gives me 139 lines of output from fsatrace. If I remove every line that is identical to the previous line, I'm left with 15. Cutting the number of lines by almost a factor of 10 means less storage and less processing downstream. This will probably help with https://github.com/ndmitchell/shake/pull/334

ndmitchell commented 8 years ago

Removing all duplicates would be one approach, but omitting only adjacent duplicates is probably simpler, is guaranteed to lose less information, and gives most of the gains.
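
To make the "adjacent duplicates only" idea concrete, here is a minimal sketch in Python. It is not fsatrace's implementation (fsatrace itself is written in C); the function name `drop_adjacent_duplicates` and the sample trace lines are hypothetical and only illustrate the technique.

```python
from itertools import groupby


def drop_adjacent_duplicates(lines):
    """Keep the first line of every run of identical consecutive lines."""
    # groupby() groups consecutive equal items, so each key is one run.
    return [line for line, _run in groupby(lines)]


# Hypothetical trace lines, made up for illustration only.
sample = [
    "r|/usr/include/stdio.h",
    "r|/usr/include/stdio.h",
    "r|main.c",
    "r|/usr/include/stdio.h",
    "w|main.o",
    "w|main.o",
]

deduped = drop_adjacent_duplicates(sample)
```

Note that the second, non-adjacent read of the header survives, so the relative order of accesses is still visible in the result.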

jacereda commented 8 years ago

I'll probably refactor the whole thing to have a common frontend for Windows/Unix and implement it there. Probably as a flag, since I think clang behaves better.

ndmitchell commented 8 years ago

What are the reasons you might want duplicate adjacent lines? Do they communicate real information?

If you're going for a flag, and it's easy, Shake really only wants each line once, with all duplicates removed, adjacent or not. But Shake can certainly do that on its side.
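
For comparison, removing every duplicate while keeping first-occurrence order could be done downstream along these lines. This is a sketch in Python, not Shake's actual code (Shake is written in Haskell); `drop_all_duplicates` and the sample lines are hypothetical.

```python
def drop_all_duplicates(lines):
    """Keep only the first occurrence of each line, preserving order."""
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(lines))


# Hypothetical trace lines, made up for illustration only.
sample = [
    "r|/usr/include/stdio.h",
    "r|main.c",
    "r|/usr/include/stdio.h",
    "w|main.o",
    "r|main.c",
]

unique_lines = drop_all_duplicates(sample)
```

Unlike the adjacent-only approach, this has to remember every line seen so far, and it discards the information that a file was accessed again later.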

jacereda commented 8 years ago

Now -- should emit the output with duplicates removed, and --- should emit the whole thing.

The raw output can be useful to debug fsatrace and to detect strange behaviour I guess.

ndmitchell commented 8 years ago

Confirmed, that's a lot more manageable. Thanks.