research-software-directory / RSD-as-a-service

This repo contains the new RSD-as-a-service implementation
https://research.software

Scraper Utils.get() throws exception when API_CREDENTIALS_GITHUB is unset #198

Closed: cmeessen closed this issue 2 years ago

cmeessen commented 2 years ago

When I comment out API_CREDENTIALS_GITHUB in .env, the scraper throws exceptions when trying to send the HTTP request:

Exception when handling data from url https://github.com/GooglingTheCancerGenome/sv-channels/:
java.lang.IllegalArgumentException: wrong number, 0, of parameters
    at java.net.http/jdk.internal.net.http.common.Utils.newIAE(Utils.java:286)
    at java.net.http/jdk.internal.net.http.HttpRequestBuilderImpl.headers(HttpRequestBuilderImpl.java:135)
    at java.net.http/jdk.internal.net.http.HttpRequestBuilderImpl.headers(HttpRequestBuilderImpl.java:43)
    at nl.esciencecenter.rsd.scraper.Utils.get(Utils.java:24)
    at nl.esciencecenter.rsd.scraper.GithubSI.lambda$license$3(GithubSI.java:29)
    at java.base/java.util.Optional.orElseGet(Optional.java:364)
    at nl.esciencecenter.rsd.scraper.GithubSI.license(GithubSI.java:29)
    at nl.esciencecenter.rsd.scraper.MainLicenses.main(MainLicenses.java:25)

The comments in the .env file suggest commenting out the API_CREDENTIALS_GITHUB variable if it is not needed.

Note: the token is required either for private projects, or when retrieving the commits/contributions. Languages and licenses could be retrieved without a token, e.g.

ewan-escience commented 2 years ago

That is indeed an old bug I need to solve. The documentation of the headers method says that this exception is only thrown when the number of parameters is odd, but the implementation apparently also rejects zero parameters. I'll look into it.
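One possible way to avoid the exception is to call `headers(...)` only when there is at least one name/value pair. The following is a minimal sketch, not the actual `Utils.get` code: the class `RequestSketch` and the helpers `buildGet` and `authHeaders` are hypothetical names, and the `Authorization: Bearer <token>` header format is an assumption about how the token would be sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class RequestSketch {

    // Hypothetical helper: builds a GET request and skips headers(...)
    // entirely when no name/value pairs are given, because the builder
    // rejects an empty array with IllegalArgumentException.
    static HttpRequest buildGet(String uri, String... headers) {
        HttpRequest.Builder builder = HttpRequest.newBuilder()
                .GET()
                .uri(URI.create(uri));
        if (headers != null && headers.length > 0) {
            builder.headers(headers);
        }
        return builder.build();
    }

    // Hypothetical helper: turns an optional token into a header array.
    // The "Bearer" scheme is an assumption made for this sketch.
    static String[] authHeaders(String token) {
        if (token == null || token.isBlank()) {
            return new String[0];
        }
        return new String[] {"Authorization", "Bearer " + token};
    }
}
```

With this shape, a caller could pass `authHeaders(System.getenv("API_CREDENTIALS_GITHUB"))` and get an unauthenticated request when the variable is unset, instead of an exception.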

I think there are more env variables that will cause errors when they are commented out; I'll look into that too.