eclipse-researchlabs / scava

Eclipse Public License 2.0
clarify factoids metrics and values #18

Open davidediruscio opened 4 years ago

davidediruscio commented 4 years ago

I have started to take a closer look at factoids in SCAVA:

Can we please clarify the following:

davidediruscio commented 4 years ago

Some factoid metrics should have been improved with commit https://github.com/crossminer/scava/commit/b243dd5d0a48c1bc893a92a03d8c276f93dcc85c. However, we will check some of those that are related to our deliverables.

davidediruscio commented 4 years ago

@mhow2

Not all summaries on /factoids are documented

Thank you for bringing the documentation issue to us. The factoids were ported directly to Scava from OSSMETER. We will update these as soon as we can.

It looks to me that not all factoids can provide a STAR score....

I would suggest that you raise this during the meeting in Paris. Perhaps Davide or Yannis may be able to explain the reasoning behind the decision to implement the star rating in OSSMETER.

For some of the facts it's not always clear to me what to conclude from the STARS score. Is it good or bad, etc.

With regard to the number of stars, I believe a higher number of 'STARS' corresponds to a more positive outcome. Again, this is based on reading the code, so I would suggest raising this in Paris as well.

Some of the facts' text (on /f) mentions strange chars like (� %)

The strange character you are seeing comes from an error in the calculations ported from OSSMETER. Essentially, the current code does not prevent the calculation from dividing by 0. Since the code converts the resulting float to a String, the character (which stands for NaN) is returned. We will correct the instances of this in the factoids relating to our WP.
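To illustrate the kind of guard meant here, below is a minimal sketch, not the actual Scava or OSSMETER code. The class `SafePercentage` and its methods are hypothetical names, and returning 0 for an empty total is an assumption; a factoid might equally choose to report "no data" in that case.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of the Scava code base.
final class SafePercentage {

    // Returns part/total as a percentage, or 0 when total is 0 (assumed policy),
    // so that the float division 0f / 0f (which yields NaN) never happens.
    static float percentage(int part, int total) {
        if (total == 0) {
            return 0f;
        }
        return 100f * part / total;
    }

    // Formats the guarded percentage with two decimals, independent of locale.
    static String format(int part, int total) {
        return String.format(Locale.ROOT, "%.2f %%", percentage(part, total));
    }

    public static void main(String[] args) {
        // Unguarded float division of zero by zero produces NaN, not an exception.
        System.out.println(Float.isNaN(0f / 0f));
        System.out.println(format(0, 0));
        System.out.println(format(1, 4));
    }
}
```

Checking the denominator before dividing keeps NaN out of the value that is later converted to a String, which is where the strange character becomes visible.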

We don't have much info about the time period that has been considered for computing the facts. It's based on the whole available data, I suppose.

You are correct. Currently, Scava requires the user to specify the date range for the analysis of a project, either through a fixed date analysis (where the user specifies a start and end date) or via a daily analysis (where the user only specifies a start date). I know that information such as the number of days of analysis can be calculated from data within the database. However, I am unsure who is responsible for making this visible to an end user. I agree with you that this information is vital when interpreting the factoids.
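As a rough sketch of how the number of days of analysis could be derived from the two analysis modes described above: the class `AnalysisPeriod`, its method, and the convention that a missing end date means "up to today" are all assumptions for illustration, not the way Scava stores or exposes this information.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Hypothetical helper for illustration only; not part of the Scava code base.
final class AnalysisPeriod {

    // Number of days covered by an analysis, counting both the start and end day.
    // A null end date models the daily analysis (assumed to run up to "today").
    static long daysAnalysed(LocalDate start, LocalDate end, LocalDate today) {
        LocalDate effectiveEnd = (end != null) ? end : today;
        return ChronoUnit.DAYS.between(start, effectiveEnd) + 1;
    }

    public static void main(String[] args) {
        LocalDate start = LocalDate.of(2019, 1, 1);
        // Fixed date analysis: explicit start and end date.
        System.out.println(daysAnalysed(start, LocalDate.of(2019, 1, 31), LocalDate.now()));
        // Daily analysis: only a start date, so the period ends today.
        System.out.println(daysAnalysed(start, null, LocalDate.now()));
    }
}
```

Showing such a value next to each factoid would let a reader tell whether a score reflects a week or several years of project history.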

There are factoids with null value.

Looking at the information you provided, I believe the factoids that are returning null are related to CWI's WP. @tdegueul should be able to help clear this up.

davidediruscio commented 4 years ago

We need to do a checking pass on factoid accuracy and the questions raised in this issue. Who can help?

davidediruscio commented 4 years ago

@mhow2, I'll discuss this with @nnamokon tomorrow

davidediruscio commented 4 years ago

Hello @mhow2,

First of all, it seems that on the day you raised the bug, I did not correctly understand that there was an issue with the weekly factoid. I have now checked and yes, there was an issue, although it was in the transient metric in charge of calculating the percentage of comments per day. The error was that it was setting a percentage of 14.29 when the days did not have any comments. The same issue affected the equivalent metric for newsgroups. I have pushed the changes in https://github.com/crossminer/scava/commit/3a95498638d7164a9ecb99334c4bbf7d90672b58. However, it might be too late for projects that have already been analyzed.
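For readers who want to see the shape of the problem, here is a minimal sketch of a per-day share calculation. It is not the code from the commit above; the class `DailyCommentShare` and its method are hypothetical. The value 14.29 is what 100/7 rounds to, so one plausible reading (an assumption) is that an even share across the seven weekdays was reported even though there were no comments to distribute.

```java
// Hypothetical helper for illustration only; not the code changed in the commit.
final class DailyCommentShare {

    // Returns, for each day, the percentage of the period's comments made on that day.
    // When there are no comments at all, every day gets 0 rather than an even share.
    static float[] percentages(int[] commentsPerDay) {
        int total = 0;
        for (int count : commentsPerDay) {
            total += count;
        }
        float[] shares = new float[commentsPerDay.length]; // initialised to 0f
        if (total == 0) {
            return shares;
        }
        for (int i = 0; i < commentsPerDay.length; i++) {
            shares[i] = 100f * commentsPerDay[i] / total;
        }
        return shares;
    }

    public static void main(String[] args) {
        // A week without comments: all shares stay at 0.
        System.out.println(java.util.Arrays.toString(percentages(new int[7])));
        // A week with one comment on the first day and three on the last.
        System.out.println(java.util.Arrays.toString(percentages(new int[] {1, 0, 0, 0, 0, 0, 3})));
    }
}
```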

Concerning the documentation of factoids: this will be done by @nnamokon, who will explain the reasons or the ranges used to give specific stars to the projects.

With respect to the factoids that are null, those seem to be related to Rascal, thus @tdegueul should be able to fix the missing factoid field.