Swanaccount: Find, Understand, and Extend Development Screencasts on YouTube #2

Open timm opened 7 years ago

timm commented 7 years ago

https://github.com/researchart/swan17/blob/master/pdf/SWANAccount.pdf


timm commented 7 years ago

_AUTHORS: Important. Do NOT reply till all three reviews are here (until then, we will delete your comments)_.


Reviewer1

Insert reviewer github id here ==> gray-swan

Recommendation (select one)

Summary (1 para)

The paper presents three case studies on software development screencasts as an information source for developers' knowledge seeking. The results of the case studies show that (i) developer screencasts exhibit higher similarity between their frames than other video types, which helps identify them; (ii) it is possible to identify the main screencast topics by using their transcripts; and (iii) developer screencasts can be tied to relevant APIs by leveraging the textual similarity between API documents and screencast transcripts.
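For context on finding (i), here is a minimal sketch of consecutive-frame cosine similarity. It assumes frames are reduced to grayscale intensity histograms; the paper's exact feature representation is not given in this thread, so the `frame_histogram` helper is illustrative only.

```python
import numpy as np

def frame_histogram(frame: np.ndarray, bins: int = 64) -> np.ndarray:
    """Reduce a grayscale frame to a normalized intensity histogram."""
    hist, _ = np.histogram(frame, bins=bins, range=(0, 255))
    return hist / max(hist.sum(), 1)

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two feature vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom else 0.0

def mean_consecutive_similarity(frames: list[np.ndarray]) -> float:
    """Average similarity of consecutive frames. Development screencasts,
    being mostly static IDE views, should score higher than other video
    types; that is the signal finding (i) exploits."""
    hists = [frame_histogram(f) for f in frames]
    return float(np.mean([cosine_similarity(h1, h2)
                          for h1, h2 in zip(hists, hists[1:])]))
```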

Advocacy (accept since, reject since, 1 para)

The paper presents appealing ideas for handling the large amount of information embedded in developer screencasts, which have been seldom exploited in software engineering research. The ideas are developed in three case studies addressing different questions about the visual and spoken content of developer screencasts. Given the promising topic and results, I advocate for the acceptance of the paper once the major issues pointed out below are addressed.

List of "Pros"

List of "Cons"

Changes needed before I can recommend accept (if any)

s# = section #

Major

s3. To answer RQ2, two analyses are performed, one using screencast titles and the other using their transcripts. However, there is no comparison between the two analyses, nor any discussion of how they complement or overlap each other.

s3-s4. The conclusions (the boxed summaries) of each part of the study are not very clear and do not seem to answer the original RQs. In particular, RQ2 and RQ3 seem to be answered with specific statements that are detached from their respective discussions. Also, the conclusions sometimes focus on certain specific results without any clear justification, and sometimes they go beyond the evidence of the study, as the points below illustrate.

s4. Recall is used to assess the identification of relevant APIs for screencasts. However, unless a thorough analysis of all 9,455 documents was done to build the gold set for each of the 32 screencasts in Section 4, recall does not seem to be an appropriate quality measure in this case (a small sketch follows this list).

s7. The final conclusion is misleading, as it currently mixes results from the three case studies even though they use different study subjects. Specifically, one of the conclusions of the paper is that six main topics exist in Java screencasts, and that one of them is how-to screencasts (which doesn't seem to be the case). Where is how-to listed as a screencast topic?
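To make the recall objection concrete: recall@k divides by the size of the gold set, so an incomplete gold set silently inflates the score. A minimal sketch (the function name and signature are mine, not the paper's):

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int = 10) -> float:
    """Fraction of ALL relevant documents that appear in the top-k results.

    The denominator is the full gold set. Unless every candidate document
    (all 9,455 of them) was judged for each screencast, `relevant` is
    incomplete and the reported recall is not trustworthy."""
    if not relevant:
        raise ValueError("recall is undefined without a gold set")
    hits = sum(doc in relevant for doc in retrieved[:k])
    return hits / len(relevant)
```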

Need clarification/justification/modification

s2. "The Cosine algorithm is the best algorithm to identify a development screencast from other video types (highest concentration of similarity values)." => the best one within the studied algorithms s3. "Method and system operations are also frequently occurring tasks during development" => in development screencasts s3. "Interestingly, UI operations are also shown to be one of the main activities performed when comprehending software [16]." => This is indeed interesting, but also out-of-topic as program comprehension is not the main issue here. It needs to be pointed out that UI operations are one of the topics within the screencasts. s3. Table 1 can be more informative by adding the frequency of each topic. It is unclear if the topics presented in this table are all the ones found or if other topics were identified. If there are more topics, what is so interesting about these ones? Also, it is unclear what "repeatable tasks" are. s3. If only nouns from transcripts were considered when applying LDA, why Table 1 shows verbs as part of the terms describing a topic? s3. The conclusions of Section 3 are focused on certain topics (database operations and testing) without any particular reason. s4. Why was TaskNav chosen for the study? s4. While the first sentence of the section states that 35 screencasts were used in the study, Figure 5 states that 32 screencasts where used. Which one is it? s4. "This result is more accurate than random, which has a precision below 1%." => Unclear. Can you elaborate on this? s6. The last paragraph is a summary of the work presented in the paper, more than a a discussion of results or limitations of the study. s7. "In this stage, a couple of relevant API documents are provided within a list of 10 items." => which stage?

Typos, grammar errors

s1. "human interactions [9] e.g. " => add comma before e.g. s1. "A task in a development screencast can be assigned to an topic of so ware development" => a topic s2. "We calculated the frame similarity of every video of every video type" => similarity of every video type s2. "Development screencasts seems to be more static" => seem to be s3. "We performed two different analysis about the topics" => analyses s3. "We found that such need is present in development screencasts" => such a need s4. "for every development task that were performed" => that was s4. "the relevant documentation pages were found in the top-10 retrieved position." => positions s4. "the text that might appear in a scene e.g. an IDE" => add comma before e.g. s5. "MacLeoad et al.[17]" => MacLeod s5. "We extend their work by linking screencasts with API documents and show how similar they actually are." => and by showing s6. "we found that frames in a development screencast seems to be very much alike" => seem to be s6. "Leveraging our results, a simple tool..." => By leveraging

timm commented 7 years ago

_AUTHORS: Important. Do NOT reply till all three reviews are here (until then, we will delete your comments)_.


Reviewer2

Insert reviewer github id here ==>

Recommendation (select one)

Summary (1 para)

Advocacy (accept since, reject since, 1 para)

I think there are flaws in this paper, mainly related to justification. However, I also think it could spark very useful discussion on screencasts, their uses, and even techniques for making good screencasts. So despite these flaws, I recommend acceptance, but I ask for a look at the findings boxes at the end of each section. Some of these should be removed.

List of "Pros"

List of "Cons"

Section 3 - Figure 4 is too small to read.

Section 4 - I like the idea of this section, that is, finding relevant API documents to "attach" to screencasts. However, given the very low precision of your search technique (below 10% at all top-N cutoffs), I do not see the usefulness of your current approach. If you attach 10 documents to a screencast and only one is relevant, that is not helpful to the viewer. While this is still better than random, I would like the authors to be more direct in admitting this limitation. Again, the "findings box" at the end of the section seems poorly justified. Specifically, "Extracting the content of the development screencast—e.g., the code showed on the screen when using an IDE—might lead to a higher precision/recall to identify a development screencast." would be great in a future work or conclusion section. It is, in this context, presented as some kind of finding from the research, when it is more accurately a hypothesis for a future study.
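To spell out the arithmetic behind this objection, a small illustrative computation with hypothetical document names (not taken from the paper): one relevant document among ten attached gives precision@10 of exactly 10%.

```python
def precision_at_k(retrieved: list[str], relevant: set[str], k: int = 10) -> float:
    """Fraction of the top-k retrieved documents that are relevant."""
    top = retrieved[:k]
    return sum(doc in relevant for doc in top) / len(top)

# Hypothetical example mirroring the reviewer's scenario:
# ten documents attached to a screencast, only one of them relevant.
retrieved = [f"api_doc_{i}" for i in range(10)]
relevant = {"api_doc_3"}
print(precision_at_k(retrieved, relevant))  # 0.1, i.e. 10%
```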

Typos: I didn't put a large effort into tracking typos, but I tried to note the glaring ones that stand out.

- There is no author section on this paper
- p1: explicitely => explicitly
- "However, attaining such tasks" - should be "completing such tasks"
- "A task in a development screencast can be assigned to an topic of software development" - I cannot

Section 4 - first sentence "miss-spelt" - misspelled

Changes needed before I can recommend accept (if any)

Overall, I think the "findings boxes" at the end of each section need to be pruned to what the studies found in concrete, statistical terms. Further, I think Section 2 needs to be better justified. Finally, I strongly urge a professional proofread of some kind.

timm commented 7 years ago

_AUTHORS: Important. Do NOT reply till all three reviews are here (until then, we will delete your comments)_.


Reviewer3

anonproton

Recommendation (select one)

Summary (1 para)

The paper explores the idea of using developer screencasts as an additional source of information in developers' knowledge-seeking activity, to answer questions related to development. The authors found that developer screencasts have a high level of similarity between their frames, which can be exploited to distinguish screencasts from other videos. The topics of developer screencasts can be extracted by mining their transcripts, which can further be mapped to relevant APIs by computing textual similarity.

Advocacy (accept since, reject since, 1 para)

I am on the fence about advocating for this paper. On one hand, I find the topic has some technical merit and the paper proposes a simple, intuitive solution; on the other hand, I feel the problem lacks clear motivation. Is it really a problem to search for screencasts on YouTube right now? If not, does the proposed solution improve the user experience enough to justify implementing it? (I don't think so, as the precision/recall numbers don't seem very high.)

A quick search on YouTube with "screencast" prefixed to the search text gave me the results I was expecting. Again, I might have overlooked certain scenarios, but please justify the cases where you think a simple, common-sense search approach would not produce the expected outcome.

For me, the actual merit comes from identifying and overcoming challenges in developers' current knowledge-seeking practice. The authors do make some attempt at this in Section 3; more of that would help position this work in the right context. Overall, I would be in a position to appreciate the work much more if the authors focused on the end-to-end scenario of how one could potentially improve developers' knowledge-seeking experience.

List of "Pros"

List of "Cons"

Changes needed before I can recommend accept (if any)

Please see "advocacy" section.

obaysal commented 7 years ago

@timm I'd match issue IDs according to our master file. The template otherwise looks good.

timm commented 7 years ago

Authors? Comments? Good responses to the above could lead to acceptance.

SWANAccount commented 7 years ago

Hi Tim,

we will respond on Wednesday.

For sure, we will try to resolve the mentioned issues.

In the meantime, thanks for the comments!

Best regards, Authors


SWANAccount commented 7 years ago

Dear reviewers,

Again, thank you for your comments. Attached are our responses.

gray-swan.txt swan-reviewer-18.txt anonproton.txt

SWANAccount commented 7 years ago

Dear reviewers,

please find below the latest version of our answers, which we have slightly improved.

Based on the call for papers and the notification messages from the chairs, we assume that we do not need to provide a revision of the paper at this stage. Otherwise, please advise us on how to proceed.

Thank you.

@gray-swan gray-swan.txt

@swan-reviewer-18 swan-reviewer-18.txt

@anonproton anonproton.txt