LLNL / scraper

Python library for getting metadata from source code hosting tools
MIT License

Use logging instead of printing #58

Open vsoch opened 3 years ago

vsoch commented 3 years ago

Heyo! I'm wondering if, instead of having a bunch of print statements (that the user cannot control):

Retrieving repository info for LLNL/b-mpi3
Checking GitHub API token... Token validated.
Auto-retry limit for requests set to 10.
Reading '/home/vanessa/Desktop/Code/contributor-ci/contributor_ci/main/extractors/repos/repos-info.gql' ... File read!
Sending GraphQL query...
Checking response...
HTTP STATUS 200 OK
API Status {"limit": 5000, "remaining": 4998, "reset": 1624819230}
Data received!

it would be possible and make sense to use logging instead, so the output can be made quiet? For my use case, I have a command that hits a few API endpoints and then needs to pipe to a file, and I'm not able to control this output.
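A minimal sketch of the idea, assuming a module-level logger: the logger name `scraper.github` and the function `fetch_repo_info` are hypothetical stand-ins, not the library's actual API. The point is only that messages sent through `logging` can be silenced by the caller, which plain `print` calls cannot.

```python
import logging

# Hypothetical logger name; the library's real module layout may differ.
logger = logging.getLogger("scraper.github")


def fetch_repo_info(repo: str) -> dict:
    """Hypothetical stand-in for a query function that logs instead of printing."""
    logger.info("Retrieving repository info for %s", repo)
    logger.debug("Sending GraphQL query...")
    return {"repository": repo}


# The caller, not the library, decides how much output to show.
# At WARNING level, the INFO and DEBUG messages above are suppressed.
logger.setLevel(logging.WARNING)
info = fetch_repo_info("LLNL/b-mpi3")
```

With this arrangement, a command that pipes its result to a file would only need to raise the logger's level (or attach a handler that writes to stderr or a log file) to keep stdout clean.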

IanLee1521 commented 3 years ago

I think this is a great idea! No objections from me.

Is there any chance you'd be willing to work this in to a pull request that we could review and merge?

LRWeber commented 3 years ago

FWIW, there is actually a verbosity parameter included in the query methods, e.g.:
https://github.com/LLNL/scraper/blob/471e06df3c2547b181974b50000dbe32dd86cf54/scraper/github/queryManager.py#L146
https://github.com/LLNL/scraper/blob/471e06df3c2547b181974b50000dbe32dd86cf54/scraper/github/queryManager.py#L161-L165
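If the existing verbosity parameter were kept alongside logging, one option would be to translate it into a logging level. The mapping below is purely an assumption for illustration; `verbosity_to_level` is a hypothetical helper, and the thresholds are not taken from the library.

```python
import logging


def verbosity_to_level(verbosity: int) -> int:
    """Map an integer verbosity to a logging level (hypothetical mapping)."""
    if verbosity < 0:
        return logging.ERROR    # quiet: errors only
    if verbosity == 0:
        return logging.WARNING  # default: warnings and errors
    if verbosity == 1:
        return logging.INFO     # progress messages
    return logging.DEBUG        # everything, including request details
```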

leebrian commented 3 years ago

I think a PR with these updates would be nice.

I’ve gotten around needing this by just piping the output from the process to whatever log file I like, and using the verbosity param that LRWeber points out.
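For callers using the library from Python rather than the shell, a similar workaround can be sketched with the standard library's `contextlib.redirect_stdout`. Here `noisy_query` is a hypothetical stand-in for any call that prints status messages, and `call_quietly` is an illustrative helper, not part of the library.

```python
import contextlib
import io


def noisy_query(repo: str) -> dict:
    """Hypothetical stand-in for a library call that prints status messages."""
    print(f"Retrieving repository info for {repo}")
    return {"repository": repo}


def call_quietly(func, *args, **kwargs):
    """Run func while capturing anything it prints to stdout."""
    captured = io.StringIO()
    with contextlib.redirect_stdout(captured):
        result = func(*args, **kwargs)
    # Return the captured text too, so it can be sent to a log file if wanted.
    return result, captured.getvalue()


result, chatter = call_quietly(noisy_query, "vsoch/salad")
```

This only intercepts Python-level writes to `sys.stdout`; output written to stderr or by child processes is not captured.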

vsoch commented 3 years ago

Setting verbosity to -1 still can't control the print statements:

$ cci cfa --terminal https://github.com/vsoch/salad 
Generating CFA for https://github.com/vsoch/salad
Stored new data file path '/home/vanessa/Desktop/Code/contributor-ci/.cci/data/latest/cci-repos.json'
Importing existing data file '/home/vanessa/Desktop/Code/contributor-ci/.cci/data/latest/cci-repos.json' ... Imported!

Retrieving repository info for vsoch/salad
Checking GitHub API token... Token validated.
Auto-retry limit for requests set to 10.
# nothing above this line should be printed!
---
repository: vsoch/salad
title: vsoch/salad
---

vsoch commented 3 years ago

Would y'all be okay with removing these custom print functions in favor of standard Python logging? It looks like you use it in other parts of the library, but it didn't make it here! Do you remember if there was a special reason for this?

LRWeber commented 3 years ago

The primary reason for using the basic print rather than logging was to enable some of the more human-friendly feedback when the commands are run directly in real time. One example is the visible countdown while waiting to retry a query (especially when the GitHub API limit triggers a much longer wait time).

In short, this update will mean more extensive reworking of how some of the info is presented. However, if we believe the tradeoff is worthwhile, I'm sure we can come up with a similarly informative, logging-friendly version.
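One possible shape for a logging-friendly countdown, as a rough sketch rather than a proposal for the actual implementation: instead of rewriting a single terminal line, it emits a periodic log message with the time remaining. The logger name, the function `wait_with_countdown`, and its `step` and `sleep` parameters are all hypothetical.

```python
import logging
import time

# Hypothetical logger name; the library's real module layout may differ.
logger = logging.getLogger("scraper.github")


def wait_with_countdown(seconds: int, step: int = 10, sleep=time.sleep) -> None:
    """Wait `seconds`, logging the remaining time every `step` seconds.

    `sleep` is injectable so the function can be exercised without waiting.
    """
    remaining = seconds
    while remaining > 0:
        logger.info("Retrying in %d seconds...", remaining)
        pause = min(step, remaining)
        sleep(pause)
        remaining -= pause
    logger.info("Retrying now.")
```

A user who wants the real-time feedback would run at INFO level; a user piping output to a file would raise the level and see nothing. For long API rate-limit waits, a larger `step` keeps the log from filling up.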