Closed leebrian closed 6 years ago
You got it. I added one for GitLab. If you let me know what way you'd like to pass configs, I can complete the todo for GitLab in a different pull request. So default can not to be to pound with api calls, but if needed, users can select it.
I think you might have missed the change for the GitLab? Looks like you only took out the comment there.
As for as configs... what do you mean by that? Not sure I'm following.
There should be a new line at 286 with a todo for gitlab. And line 25 removed the todo for GitHub.
Re: parameters The core parameters come from config.json and are overriden in gen_code_gov_json.py.main. And then there are extra parameters passed in as command line arguments and parsed with argparse that override config.json. But there are also some like "skip-labor-hours" that is only passed as an argument to main and is passed down to process_config as an extra method argument.
Do you have a preference for how to pass? Or you're ok with me adding another argument for passing a new "fetch-remote-license" that the GitLab class will see and have a default value to False?
Ah ok.
I see what you mean.
I think that the compute_labor_hours
(https://github.com/LLNL/scraper/blob/1d373b4b4b339b249b0c553ad8248ea45913cad1/scraper/gen_code_gov_json.py#L77) setting should be moved into the config up front instead:
https://github.com/LLNL/scraper/blob/1d373b4b4b339b249b0c553ad8248ea45913cad1/scraper/gen_code_gov_json.py#L48-L58
That is the new way, and I may just not have gotten that one reference.
That makes sense. I'll refactor out the gitlab bit and add the param to config.json, but not sure when as it's not as important for our catalog (GitLab is used internally so any OSS projects will get cloned out to Github).
Shall I go ahead and merge this in for now, and you can submit another PR for the next change (aka gitlab) ?
Yes please, if it's ok with you. I don't want to make the changes coupled so this one lingers.
CDC wants to automatically capture license info where possible. Added these few lines to use github info where it's already in the response. Didn't add GitLab as that makes a separate http request to get license info.
Not sure if you wanted to parameterize this and you have two different ways (config file, and command line) so didn't add this in. But please let me know if you have a preference and I'll add parameters. Currently will ignore if no license info is available for repo.