xxyzz / ostep-hw

Operating Systems: Three Easy Pieces (OSTEP) homework and project solutions
GNU General Public License v3.0

Question about homework for chapter6 #19

Closed kaibocai7 closed 1 year ago

kaibocai7 commented 1 year ago

Thanks for creating this repo.

I have one question about the chapter 6 homework on measuring context switch time. Each loop iteration should involve two context switches, right? One from parent to child and one from child back to parent. In that case, shouldn't the final result be divided by 2 to get the real time per switch?

I'd appreciate any explanation.
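
The loop described in the question can be sketched roughly as follows. This is a minimal sketch and not the repository's code: it assumes a pipe-based ping-pong between parent and child, the function name `switch_cost_us` is hypothetical, and it assumes the program is run pinned to one CPU (for example under `taskset -c 0`) so that each handoff actually forces a switch. The measured time also includes the cost of the `read` and `write` calls themselves.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: estimated microseconds per context switch,
 * or -1.0 on error. Each round is one parent->child handoff plus one
 * child->parent handoff, so the total is divided by 2 * rounds. */
double switch_cost_us(long rounds) {
    assert(rounds > 0);
    int to_child[2], to_parent[2];
    char token = 'x';

    if (pipe(to_child) == -1) return -1.0;
    if (pipe(to_parent) == -1) {
        close(to_child[0]); close(to_child[1]);
        return -1.0;
    }

    pid_t pid = fork();
    if (pid == -1) {
        close(to_child[0]); close(to_child[1]);
        close(to_parent[0]); close(to_parent[1]);
        return -1.0;
    }

    if (pid == 0) {
        /* Child: wait for the token, then hand it straight back. */
        close(to_child[1]); close(to_parent[0]);
        for (long i = 0; i < rounds; i++) {
            if (read(to_child[0], &token, 1) != 1) _exit(1);
            if (write(to_parent[1], &token, 1) != 1) _exit(1);
        }
        _exit(0);
    }

    /* Parent: send the token and block until it comes back. */
    close(to_child[0]); close(to_parent[1]);
    int ok = 1;
    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < rounds && ok; i++) {
        if (write(to_child[1], &token, 1) != 1) ok = 0;
        else if (read(to_parent[0], &token, 1) != 1) ok = 0;
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    close(to_child[1]); close(to_parent[0]);
    waitpid(pid, NULL, 0);
    if (!ok) return -1.0;

    double total_us = (end.tv_sec - start.tv_sec) * 1e6
                    + (end.tv_nsec - start.tv_nsec) / 1e3;
    return total_us / (2.0 * (double)rounds);  /* two switches per round */
}
```

Calling `switch_cost_us(1000000)` from `main` and printing the result gives a per-switch estimate; without the division by 2 the same loop would report the cost of a full round trip.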

xxyzz commented 1 year ago

I think you're right, but I'm confused by the test results. I tried lmbench, and here are the results:

My code (not divided by 2):

$ 1.out
system call: 0.075481 microseconds

context switch: 0.331194 microseconds

lmbench (1.02 microseconds):

$ taskset -c 0 ./bin/x86_64-linux-gnu/lat_ctx -N 10000 2
"size=0k ovr=0.27
2 1.02

I also found this context switch benchmark: https://openbenchmarking.org/test/pts/ctx-clock and ran the gist code:

$ taskset -c 0 ./ctx_clock.out 
ctx: 152 clocks

Dividing 152 clocks by the CPU base clock speed gives around 0.04 microseconds.
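
The conversion from clocks to microseconds can be written out as a small helper. This is only an illustrative sketch: the function name `cycles_to_us` is hypothetical, and the 3.8 GHz used in the usage note is an assumed clock speed chosen because it reproduces roughly 0.04 microseconds. The thread does not state the actual CPU frequency.

```c
#include <assert.h>

/* Hypothetical helper: convert a cycle count to microseconds.
 * clock_ghz is the CPU clock in GHz; 1 GHz is 1000 cycles per
 * microsecond, so time_us = cycles / (clock_ghz * 1000). */
double cycles_to_us(double cycles, double clock_ghz) {
    assert(clock_ghz > 0.0);
    return cycles / (clock_ghz * 1000.0);
}
```

With the assumed clock, `cycles_to_us(152.0, 3.8)` comes out at about 0.04 microseconds, which matches the figure quoted above.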

I don't know which of them is the most accurate, but lmbench's code is the most complicated, and it even measures its own overhead.

xxyzz commented 1 year ago

I have updated the code in this commit: https://github.com/xxyzz/ostep-hw/commit/42fc3eddae6d5cdccf449aaab44feb3ea9f3f567

I checked the code with the perf command, and it shows 2,000,018 context switches, so the final result needs to be divided by 2. Thank you for pointing out this error!

Outputs:

$ sudo perf stat ./1.out
system call: 0.195281 microseconds

context switch: 3.700195 microseconds

 Performance counter stats for './1.out':

          7,595.82 msec task-clock                       #    1.000 CPUs utilized             
         2,000,018      context-switches                 #  263.305 K/sec                     
                 0      cpu-migrations                   #    0.000 /sec