MrDiggles2 / cru-scrape

Scraper of CRU sites

cli-refactor-with-typer #9

Closed shvets92 closed 2 months ago

shvets92 commented 2 months ago

Still need to update the docs. I tried running this:

poetry run python main.py 2002 https://www.maine.gov/ifw/

and got the output below. It seems short, so I need to make sure it's not broken, but I'm too tired to do that right now.

Found url: http://web.archive.org/web/20021021120647/http://www.maine.gov:80/ifw/   for year: 2002
Starting http://web.archive.org/web/20021021120647/http://www.maine.gov:80/ifw/
2024-07-27 04:10:27 [scrapy.utils.log] INFO: Scrapy 2.11.2 started (bot: scrapybot)
2024-07-27 04:10:27 [scrapy.utils.log] INFO: Versions: lxml 5.2.2.0, libxml2 2.11.7, cssselect 1.2.0, parsel 1.9.1, w3lib 2.2.1, Twisted 24.3.0, Python 3.9.9 (tags/v3.9.9:ccb0e6a, Nov 15 2021, 18:08:50) [MSC v.1929 64 bit (AMD64)], pyOpenSSL 24.2.1 (OpenSSL 3.3.1 4 Jun 2024), cryptography 43.0.0, Platform Windows-10-10.0.18363-SP0
Took 0.6840410232543945 seconds

It does reject invalid years, though:

$ poetry run python main.py 1990 https://www.maine.gov/ifw/
Couldn't find a group for the year: 1990
The years that were found are: ['2002', '2003', '2004', '2005', '2006', '2007', '2008', '2009', '2010', '2011', '2012', '2013', '2014', '2015', '2016', '2017', '2018', '2019', '2020', '2021', '2022', '2023', '2024']
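For anyone reading along, the year check above can be sketched roughly like this. This is only an illustration, not the code in this branch: `group_by_year` and `pick_snapshot` are hypothetical names, and it assumes snapshots are identified by Wayback-style `YYYYMMDDhhmmss` timestamps like the one in the 2002 URL.

```python
from collections import defaultdict


def group_by_year(timestamps):
    """Group Wayback-style timestamps (YYYYMMDDhhmmss) by their 4-digit year."""
    groups = defaultdict(list)
    for ts in timestamps:
        groups[ts[:4]].append(ts)
    return dict(groups)


def pick_snapshot(timestamps, year):
    """Return the earliest snapshot for `year`, or raise ValueError listing the valid years."""
    groups = group_by_year(timestamps)
    key = str(year)
    if key not in groups:
        raise ValueError(
            f"Couldn't find a group for the year: {year}. "
            f"The years that were found are: {sorted(groups)}"
        )
    # Fixed-width timestamps sort chronologically as plain strings.
    return min(groups[key])


# The first timestamp comes from the 2002 run above; the second is made up.
snapshots = ["20021021120647", "20030402101112"]
first_2002 = pick_snapshot(snapshots, 2002)
```

Raising an exception keeps the lookup separate from the CLI, which can catch it and print the message.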

Feel free to pull this down and try it though

shvets92 commented 2 months ago

Also, I won't do it, but I am allowed to merge this branch, so the branch protection rules may not be working.

shvets92 commented 2 months ago

Also, here are the docs for the Typer CLI library: https://typer.tiangolo.com/

shvets92 commented 2 months ago

I see more results now so I think this is working. I added the verbose flag too. If you didn't already know, you can either use the --help flag or call the script with no args to show the help menu.
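A minimal sketch of what a Typer entry point with those behaviours could look like. The command name `scrape`, the help strings, and the echoed messages are assumptions for illustration; the actual main.py in this branch may be organized differently.

```python
import typer

app = typer.Typer()


# no_args_is_help=True makes a bare invocation print the help text
# instead of a "missing argument" error.
@app.command(no_args_is_help=True)
def scrape(
    year: int = typer.Argument(..., help="Archive year to scrape, e.g. 2002"),
    url: str = typer.Argument(..., help="Site URL to look up in the archive"),
    verbose: bool = typer.Option(False, "--verbose", help="Print extra detail"),
) -> None:
    """Hypothetical entry point; it only echoes what it would do."""
    if verbose:
        typer.echo(f"verbose mode on, year={year}")
    typer.echo(f"Would scrape {url} for {year}")


if __name__ == "__main__":
    app()
```

With a single registered command, Typer runs it directly, so the call stays `python main.py 2002 https://www.maine.gov/ifw/` with no subcommand name.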

shvets92 commented 2 months ago

Feel free to merge when you're ready

shvets92 commented 2 months ago

Also, if you're looking into loggers, I've had a good experience with loguru for my work CLI. It doesn't really require any boilerplate to set up and has colors built into the logs.

I just import like this: from loguru import logger as log (because you don't "logger" messages you "log" them)

and then log like so:

log.info("informational message")
log.debug("detailed message")
log.warning("don't you touch those cookies young man")
log.error("everything is on fire, call your mother")

It reads some environment variables (e.g. LOGURU_LEVEL) for setting the level if you want, but if you want to "hook it up" to the verbose flag you can do something like this:

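import sys
from loguru import logger as log

log.remove()  # drop loguru's default stderr handler, otherwise messages print twice
log.add(sys.stderr, level="DEBUG" if verbose else "INFO")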

That's what I do at work, but I think you'd need sys.stdout instead if you want to pipe those logs into other shell commands with |, since a plain pipe only carries stdout (or keep sys.stderr and redirect it with 2>&1).
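To show the stdout vs. stderr point, here is a small stdlib-only sketch. It doesn't use loguru. It starts a child Python process that writes one line to each stream and captures them separately, which is roughly what a shell does when you pipe with |. The variable names are made up for the demo.

```python
import subprocess
import sys

# Child process writes one line to each stream.
child_code = (
    "import sys\n"
    "print('to stdout')\n"
    "print('to stderr', file=sys.stderr)\n"
)

# stdout=PIPE mimics what `|` does in a shell: only stdout is connected to the pipe.
# stderr is captured on its own pipe here just to keep the demo quiet;
# in a real shell it would still go to the terminal.
result = subprocess.run(
    [sys.executable, "-c", child_code],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    text=True,
)

piped = result.stdout      # what the next command in a pipeline would receive
not_piped = result.stderr  # what would bypass the pipeline
```

So logs written to sys.stderr won't reach something like grep unless you add 2>&1 before the pipe.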