Data Intuition - Look at our data

TechnionYP5777 / Bugquery

Bug query

Data Intuition - Look at our data #115

Closed yonzarecki closed 7 years ago

yonzarecki commented 7 years ago

As suggested by Yossi, this is a good time to start looking at our stack-trace data and search for hints within it. Can we see similarities between traces that we can translate into better-performing algorithms?

We should add prints, sample instances from the DB, and see whether we can learn something important from looking at our data.

yonzarecki commented 7 years ago

Moving to Sprint 2, as this issue is not critical to our commitments for Sprint 1.

yonzarecki commented 7 years ago

@tonylekhtman Can you help me get started? Where do I need to go if I just wanna send some SQL queries and look at the results?

tonylekhtman commented 7 years ago

The best way, in my opinion, is via MySQL Workbench. It presents the results in a nice tabular form. You can also send queries via the terminal. You need to write mysql -u <user> -h <host> -p

yonzarecki commented 7 years ago

I got it working, taking a look now.

yonzarecki commented 7 years ago

From looking at the questions, I can see that most exceptions are inside "<code>" tags. This can help with exception detection and extraction (not always, though).
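A minimal sketch of how the "<code>" observation could be used, assuming question bodies are stored as HTML and that a line starting with "at " followed by a method call marks a Java stack frame. The function name `extract_trace_candidates` and both patterns are hypothetical and not taken from the Bugquery code base.

```python
import html
import re

# Matches <code>...</code> across newlines, non-greedily.
CODE_BLOCK = re.compile(r"<code>(.*?)</code>", re.DOTALL | re.IGNORECASE)
# Heuristic for a Java stack frame, e.g. "at pkg.Class.method(File.java:12)".
FRAME = re.compile(r"^\s*at\s+[\w.$<>]+\(.*\)\s*$", re.MULTILINE)


def extract_trace_candidates(body):
    """Return the unescaped <code> blocks that contain at least one stack frame."""
    blocks = (html.unescape(block) for block in CODE_BLOCK.findall(body))
    return [block for block in blocks if FRAME.search(block)]
```

Blocks without any frame line (plain code snippets) are skipped, and exceptions pasted outside "<code>" tags are not found at all, which matches the "not always" caveat.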

yonzarecki commented 7 years ago

There are also many duplicates in the DB (this may result in duplicate results at query time).
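One possible way to measure the duplicates, sketched under the assumption that rows can be read as (id, text) pairs and that two rows count as duplicates when they differ only in whitespace. `fingerprint` and `find_duplicates` are hypothetical helper names.

```python
import hashlib
from collections import defaultdict


def fingerprint(text):
    """Hash the text with whitespace runs collapsed, so layout alone cannot hide a duplicate."""
    normalised = " ".join(text.split())
    return hashlib.sha1(normalised.encode("utf-8")).hexdigest()


def find_duplicates(rows):
    """rows: iterable of (row_id, text). Return lists of ids that share a fingerprint."""
    groups = defaultdict(list)
    for row_id, text in rows:
        groups[fingerprint(text)].append(row_id)
    return [ids for ids in groups.values() if len(ids) > 1]
```

For example, `find_duplicates([(1, "a  b\nc"), (2, "a b c"), (3, "x")])` groups rows 1 and 2 together and leaves row 3 alone.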

yonzarecki commented 7 years ago

Another idea is to keep an index of all rows seen, and give lower significance to rows not in the index (probably user-specific) in the distance functions.
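A rough sketch of that idea, assuming "rows" means individual stack-trace lines and using a weighted Jaccard distance as a stand-in for whatever distance function we end up with. The names `line_weight` and `weighted_distance` and the value 0.2 for `rare_weight` are arbitrary assumptions.

```python
def line_weight(line, seen_index, rare_weight=0.2):
    """Lines already in the index count fully; unseen (likely user-specific) lines count less."""
    return 1.0 if line in seen_index else rare_weight


def weighted_distance(trace_a, trace_b, seen_index, rare_weight=0.2):
    """Weighted Jaccard distance over the sets of stripped, non-empty lines, in [0, 1]."""
    a = {line.strip() for line in trace_a.splitlines() if line.strip()}
    b = {line.strip() for line in trace_b.splitlines() if line.strip()}
    union = a | b
    if not union:
        return 0.0
    total = sum(line_weight(line, seen_index, rare_weight) for line in union)
    shared = sum(line_weight(line, seen_index, rare_weight) for line in a & b)
    return 1.0 - shared / total
```

With this weighting, two traces that share a common library frame but differ in their user-specific frames end up closer than they would under a plain Jaccard distance.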

yonzarecki commented 7 years ago

I suppose this is already implemented, but most newlines in the DB are represented by " ". If it isn't, handling this can help with exception parsing.

yonzarecki commented 7 years ago

We should look more into the exception trace print formats. There are several of them, yet they all represent the same thing. A few examples are 268 vs. 261 vs. 262, etc.

It might be a good idea to analyse these formats more closely and define a "generic" print to compare against. This way we won't miss a good answer because of different printing styles.
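A minimal sketch of what such a "generic" form could look like. The actual print styles in the DB are not shown in this thread, so this assumes Java-style traces in which frames look like "at pkg.Class.method(File.java:12)" and may or may not be on separate lines. `normalise_trace` and both patterns are hypothetical.

```python
import re

# A frame such as "at com.foo.Bar.baz(Bar.java:10)"; only the method path is kept.
FRAME = re.compile(r"\bat\s+([\w.$<>]+)\s*\(")
# A name ending in Exception or Error, e.g. java.lang.NullPointerException.
EXCEPTION = re.compile(r"([\w.$]+(?:Exception|Error))\b")


def normalise_trace(raw):
    """Return (exception_name, [frame, ...]) regardless of line breaks or indentation."""
    flat = " ".join(raw.split())  # collapse every whitespace run to one space
    exc = EXCEPTION.search(flat)
    frames = [match.group(1) for match in FRAME.finditer(flat)]
    return (exc.group(1) if exc else None, frames)
```

Two traces can then be compared on their normalised tuples instead of the raw text. Dropping file names and line numbers is a deliberate choice here; it makes matching more tolerant at the cost of some precision.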

yonzarecki commented 7 years ago

I think I'm done for now; this gave us enough work for the time being. I'll open issues about these topics tomorrow.