We believe that limiting the study to small and medium-sized projects conducted using TSP is a major threat to the generalized claims made in the paper. There are also aspects of the data set and the analysis that were not clearly explained, which raises concerns about the validity of some of the presentation. We think the authors are exploring a very important idea and have a valuable data set; we encourage them to continue this line of investigation and share the results with the community, even if the claims can only be made about TSP-like processes.
I have some concerns about your "low-level" approach to data collection for fix time. Your effort data are collected at a very fine granularity, per developer and in terms of time, whereas the Boehm data and the Beck assertions are at the project level and usually in terms of cost. I don't think you can directly compare these quantities: they may not be linearly related, and it is not clear how the low-level measurements should be aggregated. A simple sum, for instance, does not take the critical path into account. Two fix tasks of 50 minutes each, worked in parallel, delay the project by only 50 minutes, but tasks of 1 minute and 99 minutes delay it by 99 minutes even though the total effort is identical. I would expect the cost to be greater when the critical path is longer.
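To make the aggregation concern concrete, here is a minimal sketch (not the authors' method; the task durations and the assumption that each fix is handled by a different developer in parallel are hypothetical) contrasting the simple sum of fix effort with the schedule delay implied by the critical path:

```python
# Hypothetical fix tasks (minutes), each assigned to a different developer
# so the fixes can proceed in parallel.
scenario_a = [50, 50]   # two 50-minute fixes
scenario_b = [1, 99]    # a 1-minute fix and a 99-minute fix

for name, tasks in [("A (50 + 50)", scenario_a), ("B (1 + 99)", scenario_b)]:
    total_effort = sum(tasks)    # what summing per-developer time measures
    schedule_delay = max(tasks)  # critical path when the fixes run in parallel
    print(f"Scenario {name}: effort = {total_effort} min, "
          f"project delay = {schedule_delay} min")

# Both scenarios cost 100 minutes of effort, but scenario B delays the
# project almost twice as long, so summed effort alone can understate
# the project-level cost that Boehm and Beck are talking about.
```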
It seems to me that "interruptions" such as meetings and requests for technical help might actually be work that is required to fix the issues. It seems questionable not to account for this time somewhere.
In Figure 10, I have some concerns about your sample sizes: there were 171 projects, yet most of the rows contain fewer than 171 items. This suggests that different projects may have very different data sets, and that interesting results might be lost by merging them all together for the sake of getting samples large enough to support "general" conclusions. Some projects could easily match the old view, and this would be lost in the pooled statistics.
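As an illustration of what pooling can hide (the project identifiers, phases, and fix times below are made up, not the paper's data), one could compute the same statistic both pooled and per project and check whether individual projects still match the old view:

```python
import pandas as pd

# Hypothetical records: one row per defect, with the project it came from,
# the phase in which it was found, and the observed fix time in minutes.
df = pd.DataFrame({
    "project":  ["P1", "P1", "P1", "P2", "P2", "P2"],
    "phase":    ["early", "late", "late", "early", "early", "late"],
    "fix_time": [10, 12, 15, 20, 90, 300],
})

# Pooled view: one statistic over all projects merged together.
pooled = df.groupby("phase")["fix_time"].median()
print("Pooled medians:\n", pooled)

# Per-project view: the same statistic computed within each project,
# so projects that *do* show the old early-vs-late pattern stay visible.
per_project = df.groupby(["project", "phase"])["fix_time"].median().unstack()
print("Per-project medians:\n", per_project)
```

Reporting something like the per-project breakdown, even in an appendix, would show how many of the 171 projects individually support or contradict the aggregate conclusion.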