Closed mantono closed 8 years ago
After doing some testing, I have come to the conclusion that pull requests are, in fact, not included when fetching regular issues from the API. I still have no idea why we download more issues than currently exist in the repository, but it seems to have nothing to do with pull requests anyway.
It seems like I was wrong :( The tdesktop data set contained an issue with an empty body, which caused the program to crash with a NullPointerException. That bug has now been fixed, but the offending issue is a pull request and not a regular issue, which means that pull requests are not being filtered out.
After some investigation, it turns out that pull requests can be detected with
if(entry.getKey().getPullRequest().getHtmlUrl() != null)
If the URL is null, then it is not a pull request. Unfortunately, all previously downloaded data was fetched without this check, and all issues without a comment were discarded as well (#22). We will therefore have to download all data sets again (#1).
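For reference, here is a minimal sketch of how the check could be applied to a whole collection. The Issue and PullRequestRef classes are hypothetical stand-ins for the API library's types, and the map of issues to comments is an assumption based on the entry.getKey() call above, so this is not the project's actual code. The sketch also guards against getPullRequest() itself returning null, which may or may not happen with the real library.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PullRequestFilter {
    /** Hypothetical stand-in for the library's pull request reference. */
    static class PullRequestRef {
        private final String htmlUrl;
        PullRequestRef(String htmlUrl) { this.htmlUrl = htmlUrl; }
        String getHtmlUrl() { return htmlUrl; }
    }

    /** Hypothetical stand-in for the library's issue type. */
    static class Issue {
        private final int number;
        private final PullRequestRef pullRequest;
        Issue(int number, PullRequestRef pullRequest) {
            this.number = number;
            this.pullRequest = pullRequest;
        }
        int getNumber() { return number; }
        PullRequestRef getPullRequest() { return pullRequest; }
    }

    /** An issue counts as a pull request only if its pull request URL is set. */
    static boolean isPullRequest(Issue issue) {
        PullRequestRef ref = issue.getPullRequest();
        return ref != null && ref.getHtmlUrl() != null;
    }

    /** Returns a copy of the map with every pull request entry left out. */
    static <V> Map<Issue, V> withoutPullRequests(Map<Issue, V> entries) {
        Map<Issue, V> kept = new LinkedHashMap<>();
        for (Map.Entry<Issue, V> entry : entries.entrySet()) {
            if (!isPullRequest(entry.getKey())) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<Issue, List<String>> downloaded = new LinkedHashMap<>();
        // Regular issue: the pull request reference exists but has no URL.
        downloaded.put(new Issue(1, new PullRequestRef(null)), Arrays.asList("first comment"));
        // Pull request: the URL is set (placeholder URL, not a real one).
        downloaded.put(new Issue(2, new PullRequestRef("https://example.invalid/pull/2")),
                Collections.<String>emptyList());

        for (Issue issue : withoutPullRequests(downloaded).keySet()) {
            System.out.println("Kept issue #" + issue.getNumber());
        }
    }
}
```

Filtering into a new map rather than removing entries while iterating avoids a ConcurrentModificationException and leaves the downloaded data untouched.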
From https://developer.github.com/v3/issues/#list-issues-for-a-repository
We will have to check the pull_request key and remove pull requests from our issue collection, as they will not contribute anything to our artefact.