systemd / casync

Content-Addressable Data Synchronization Tool
1.51k stars 117 forks source link

casync extract: Add time statistics #192

Closed elboulangero closed 5 years ago

elboulangero commented 5 years ago

This MR adds the following lines to casync extract verbose output:

Time spent seeding: 6.529038 secs (9%)
Time spent decoding: 63.074518 secs (90%)
Total extract time: 69.623602 secs (00:01:10)

Here are a few comments:

keszybz commented 5 years ago

We already have now, which returns the time in ns. I think it'd be best to build any time-reporting functionality on that. 1s granularity is obviously not going to be enough for various people, so I think it's better to implement this using ns from the start.

seeding and decoding to describe the two main steps of casync extract

Sounds OK to me.

elboulangero commented 5 years ago

We already have now, which returns the time in ns. I think it'd be best to build any time-reporting functionality on that. 1s granularity is obviously not going to be enough for various people, so I think it's better to implement this using ns from the start.

Ack, I pushed changes accordingly. I use now to get the time.

I also updated the MR description to reflect how the new output looks like.

elboulangero commented 5 years ago

Overall, the big change here is that I imported format_timespan() from systemd.

I took on me to save it in a time-util.{c,h} files, and to move other time utils from util.h to time-util.h. However it's matter of taste, so I can move the whole thing back to util.h if you prefer.

I also decided to let the code of format_timespan() unmodified (to save me the trouble to dive into this smart code), but there are a few mismatch with casync's code, that should be addressed:

Point one We now have usec_t and nsec_t types, that are used in format_timespan() only, and nowhere else in the code. If you see value in having these types around, I guess it would be good to use them everywhere where it's suitable in the code (basically just replace a bunch of uint64_t here and there). Otherwise, we can stick to uint64_t, and then remove usec_t and nsec_t from time-util.h.

Point two casync works with nseconds everywhere, but format_timespan() expects useconds. We can live with that, but maybe we can do better? To be more explicit:

I could add a now_usec() to casync, and use it in all the new functions I introduced to do time measurements. Just so that I work with usecs all along. The downside is that in some places of the code we'll have usecs, and other places we'll have nsecs. It can be a bit error-prone.

We could also rework format_timespan() to take usecs instead of nsecs for its input. That would probably be the easiest way, and that would avoid mixing usecs and nsecs in the whole casync code.

Or maybe I'm just being too complicated :sweat_smile:

Anyway, any feedback is welcome as usual!

poettering commented 5 years ago

Hmm, let's stick to uint64_t as type in casync, and only deal with ns as unit. Patching through format_timespan() fixing up the factor sounds easy enough to me. Yes, this means systemd and casync will deviate from each other in this case, but I think that's fine, if we do it for a reason.

In casync we mostly deal with ns, since we mostly deal with file system objects, and Linux exposes ns granularity for those, hence I think it is a good idea to standardize on ns time units. In systemd that's a bit different though...

elboulangero commented 5 years ago

Ok pushed it all according to your last comment, we stick to uint64_t and nseconds everywhere. I think this time it's good to be merged.