crossminer / scava

https://eclipse.org/scava/
Eclipse Public License 2.0

Number of closed issues in GitLab repository (e.g. bugs.nonResolvedClosedBugs) #416

Closed: creat89 closed this issue 4 years ago

creat89 commented 4 years ago

I have opened this issue for future reference, and also to ask the use case partners how to proceed according to their needs:

In https://github.com/crossminer/scava/issues/411#issue-514558434, it is indicated that the number of open (or not closed) issues in a specific repository doesn't match the number shown on the GitLab website.

To be precise, the platform indicates that there are 147 open issues out of 156 possible ones (within the range of analysis), but GitLab indicates that there are only 48 open issues (i.e. 108 closed issues).

This mismatch comes from the data returned by the GitLab API rather than from the Scava side. The explanation is as follows:

The code in Scava that determines whether an issue has been closed or not is not based on the status (status in the API) but on the closing date (closed_at in the API). The reason is that, when we retrieve the issue, it might already be closed, but that doesn't mean the closing happened on the delta day. For example, say Issue 1 was created on the 5th November and closed on the 10th November. When Scava analyses the 5th November, it will know that Issue 1 was created on that date, but we hide that it is closed, because it will not be closed until the 10th November. When Scava analyses the 10th November, the issue will be marked as closed.
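For reference, here is a minimal sketch of the delta-day rule described above. It is written in Python for illustration and is not Scava's actual code. The function name `closed_on_delta_day` and the year 2019 are assumptions.

```python
from datetime import date
from typing import Optional


def closed_on_delta_day(closed_at: Optional[date], delta_day: date) -> bool:
    """Return True only if the issue was closed on the day being analysed.

    Hypothetical helper: a missing closing date (None) never matches any
    delta day, so such an issue is never marked as closed.
    """
    return closed_at is not None and closed_at == delta_day


# Issue 1 from the example: created on 5th November, closed on 10th November.
# The year is an assumption made only so that the dates are valid.
created = date(2019, 11, 5)
closed = date(2019, 11, 10)

print(closed_on_delta_day(closed, created))  # False: not yet closed on the 5th
print(closed_on_delta_day(closed, closed))   # True: closed on the 10th
print(closed_on_delta_day(None, closed))     # False: no closing date at all
```

The last call shows why an issue whose closing date is missing stays open on every analysed day under this rule.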

So GitLab seems not to fill in the closed_at element in all cases, and sometimes just changes the status:

https://gitlab.ow2.org/api/v4/projects/sat4j%2Fsat4j/issues/143

https://gitlab.ow2.org/api/v4/projects/sat4j%2Fsat4j/issues/141

In the first example, GitLab indicates the date on which the issue was closed, but in the second example, it just states null.
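A small sketch of how such issues could be spotted in already-parsed API responses. It assumes the `iid`, `state` and `closed_at` fields of the GitLab issues API. The function name and the trimmed sample payloads are illustrative, not real API output.

```python
from typing import Any, Dict, List


def closed_without_date(issues: List[Dict[str, Any]]) -> List[int]:
    """Return the iid of every issue that is closed but has no closed_at."""
    return [
        issue["iid"]
        for issue in issues
        if issue.get("state") == "closed" and issue.get("closed_at") is None
    ]


# Illustrative payloads reduced to the fields that matter here.
sample = [
    {"iid": 143, "state": "closed", "closed_at": "2018-09-16T15:06:44.240Z"},
    {"iid": 141, "state": "closed", "closed_at": None},
]

print(closed_without_date(sample))  # [141]
```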

In the GitLab API documentation (https://docs.gitlab.com/ee/api/issues.html#single-issue), only this is indicated:

Note: The closed_by attribute was introduced in GitLab 10.6. This value will only be present for issues which were closed after GitLab 10.6 and when the user account that closed the issue still exists.

However, we do not make use of that attribute, but of closed_at.

I might be able to create a specific case for GitLab where I force the setting of the status closed only if the closed_at attribute is null. Otherwise, the close status will be updated only when the delta day matches the closed_at value.

Use case partners, should I implement the solution described above?

mhow2 commented 4 years ago

Good catch!

So if I understand correctly:

I might be able to create a specific case for GitLab where I force the setting of the status closed only if the closed_at attribute is null. Otherwise, the close status will be updated only when the delta day matches the closed_at value.

Sounds like a good option. BTW it seems most of the issues have closed_at set to NULL for this specific project - I guess it was created/imported a rather long time ago. So the API is consistent in this regard.

# select iid,closed_at from issues where project_id=41 and closed_at is NOT NULL;
 iid |           closed_at           
-----+-------------------------------
 157 | 2019-10-16 12:18:36.906662+00
 143 | 2018-09-16 15:06:44.240354+00
 152 | 2018-12-15 21:28:32.430583+00
 147 | 2018-11-12 15:03:35.98302+00
 129 | 2018-11-12 15:58:31.960692+00
 145 | 2018-11-10 12:41:35.656496+00
 151 | 2018-12-07 19:19:33.2596+00
 154 | 2019-01-28 12:27:27.811622+00
 144 | 2018-11-12 15:04:15.675373+00
(9 rows)

Above we only see the most recent ones.

creat89 commented 4 years ago

Hello @mhow2,

I have pushed a commit that should solve the problem with old GitLab issues. I have used the following strategy: if the issue has a closed status but closed_at is null, I use the updated_at value as the closing date, which should correspond to the closing date. That way, we should get better accuracy in the methods, without considering that the issue has been closed since it was created.
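A minimal sketch of this fallback, in Python for illustration rather than the code from the commit. It assumes the `state`, `closed_at` and `updated_at` fields of the GitLab issues API. The function name and the sample values are hypothetical.

```python
from typing import Any, Dict, Optional


def effective_closing_date(issue: Dict[str, Any]) -> Optional[str]:
    """Return the timestamp to treat as the closing date, or None if open.

    Hypothetical helper: prefers closed_at and falls back to updated_at
    only when the issue is closed and closed_at is missing.
    """
    if issue.get("state") != "closed":
        return None
    return issue.get("closed_at") or issue.get("updated_at")


# Hypothetical payloads, trimmed to the relevant fields.
legacy = {"state": "closed", "closed_at": None,
          "updated_at": "2015-03-02T10:00:00Z"}
recent = {"state": "closed", "closed_at": "2018-11-10T12:41:35Z",
          "updated_at": "2018-11-12T09:00:00Z"}
still_open = {"state": "opened", "closed_at": None,
              "updated_at": "2019-01-01T00:00:00Z"}

print(effective_closing_date(legacy))      # 2015-03-02T10:00:00Z
print(effective_closing_date(recent))      # 2018-11-10T12:41:35Z
print(effective_closing_date(still_open))  # None
```

Since updated_at records the last modification, it is only an approximation of the closing date if the issue was edited again after being closed.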