nipunarora / parikshan


comments on chapter 4 #5

Closed by gailkaiser 6 years ago

gailkaiser commented 7 years ago

"Several existing record and replay have a much higher overhead as they record low level of nondeterminism in order to capture and replay the exact state of execution." seems to be missing a word or two

"We also did a survey of 217 real-world bugs, and found them similar to the bugs presented in this case study( more details regarding the survey can be found at section 3.3.3). In this section we explain the applications that have been used in the bug case-studies." better to move the paper study of similar bugs to after the case study experiments

lots of ? cites

4.2.5 HDFS - empty section!

some of the 4.2 subsections explain how you set up the experiments whereas others only say what the target system is, and the HDFS section is totally empty. All of them should describe and cite the system, mention some similar systems, and discuss the experimental setup.

Most of the bugs described require the debug container to already be running side by side with the production container when a certain input or situation happens; they cannot be cloned after the fact. Something needs to be said about this here, even if the issue is discussed later on.

I'm not clear on how the non-crashing semantic bugs would even be noticed, unless you know to look for them. What would be in logs that would give a clue?

"In fact, it was written in reverse." I don't understand what reverse means here.

"This happens because of a bug in a the way deleted rows are not interpreted once they leave the memtable in the CFS.getRangeSlice code i.e. the flush does not recognize the delete and the purged data does not contain the delete operation." Something wrong with wording here.

"This has to do with a caching problem for large inserts where large amount of data in a partitioned table." Something wrong with wording.

Some of the data for MySQL#26257 seems to be commented out, is that intentional? In any case, what is the intuition behind this odd data? Such intuition should be discussed for all the bug examples.

"It was reported that when two or more databases are replicated, and atleast one of them is ¿=db10." This is missing a clause or something.

"One of the most subtle bugs in production systems is caused due to concurrency errors." There is a lot more than one concurrency bug!

"The bug might manifest in accesslog, showing some of the logs to be corrupted." What would be in accesslog that would show corruption?

"Wait and press enter. You will see detection log entry and the insert log entry has disordered binlog index." I'm confused, is this deterministic or non-deterministic? And what is the disordering here?

"4.3.4.5 MySQL #791 In this subsection we describe the MySQL#791 performance bug" I thought this set was supposed to be all concurrency bugs.

"The client in Redis is scheduled to be closed ASAP for overcoming of output buffer limits in the masters log file." I cannot figure out what this means.

4.4 Summary is empty, and even if it had content the chapter should have a discussion section.

Why "Part II iProbe: Creating live debugging friendly applications"???

nipunarora commented 7 years ago

"Several existing record and replay have a much higher overhead as they record low level of nondeterminism in order to capture and replay the exact state of execution." seems to be missing a word or two

fixed

"We also did a survey of 217 real-world bugs, and found them similar to the bugs presented in this case study( more details regarding the survey can be found at section 3.3.3). In this section we explain the applications that have been used in the bug case-studies." better to move the paper study of similar bugs to after the case study experiments

fixed

lots of ? cites

4.2.5 HDFS - empty section!

fixed

some of the 4.2 subsections explain how you set up the experiments whereas others only say what the target system is, and the HDFS section is totally empty. All of them should describe and cite the system, mention some similar systems, and discuss the experimental setup.

added missing details to each of them

Most of the bugs described require the debug container to already be running side by side with the production container when a certain input or situation happens; they cannot be cloned after the fact. Something needs to be said about this here, even if the issue is discussed later on.

I'm not clear on how the non-crashing semantic bugs would even be noticed, unless you know to look for them. What would be in logs that would give a clue?

I think this aspect is covered in the last chapter. This chapter does not focus on bug analysis, only on re-creating the bug. Debugging scenarios are discussed in deeper detail in the last chapter.

"In fact, it was written in reverse." I don't understand what reverse means here.

explained further

"This happens because of a bug in a the way deleted rows are not interpreted once they leave the memtable in the CFS.getRangeSlice code i.e. the flush does not recognize the delete and the purged data does not contain the delete operation." Something wrong with wording here.

fixed

"This has to do with a caching problem for large inserts where large amount of data in a partitioned table." Something wrong with wording.

Some of the data for MySQL#26257 seems to be commented out, is that intentional? In any case, what is the intuition behind this odd data? Such intuition should be discussed for all the bug examples.

deleted the commented out portion

"It was reported that when two or more databases are replicated, and atleast one of them is ¿=db10." This is missing a clause or something.

fixed

"One of the most subtle bugs in production systems is caused due to concurrency errors." There is a lot more than one concurrency bug!

fixed

"The bug might manifest in accesslog, showing some of the logs to be corrupted." What would be in accesslog that would show corruption?

added an explanation

"Wait and press enter. You will see detection log entry and the insert log entry has disordered binlog index." I'm confused, is this deterministic or non-deterministic? And what is the disordering here?

It should say "out of order" rather than "disordered"; the order of the log entries is non-deterministic.

"4.3.4.5 MySQL #791 In this subsection we describe the MySQL#791 performance bug" I thought this set was supposed to be all concurrency bugs.

Yes, this was a copy-paste mistake. It should say concurrency bug; it has been fixed.

"The client in Redis is scheduled to be closed ASAP for overcoming of output buffer limits in the masters log file." I cannot figure out what this means.

4.4 Summary is empty, and even if it had content the chapter should have a discussion section.

Added a summary

Why "Part II iProbe: Creating live debugging friendly applications"???

removed "Creating live debugging friendly applications"