freelawproject / juriscraper

An API to scrape American court websites for metadata.
https://free.law/juriscraper/
BSD 2-Clause "Simplified" License

GA: UnexpectedContentTypeError: https://efast.gaappeals.us/ #985

Open sentry-io[bot] opened 5 months ago

sentry-io[bot] commented 5 months ago

Sentry Issue: COURTLISTENER-70P

UnexpectedContentTypeError: https://efast.gaappeals.us/download?filingId=9c51bac3-cf8b-4bc5-b906-5b54f84e5d89
'"application/pdf;charset=utf-8" not in ['application/pdf']

Now that this is logged, it's all going to come to me and I'm going to have to triage it all. We should tweak the code so that these kinds of issues can be automatically assigned to you or Gianfranco, @flooie.

Filed by @mlissner
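
For context, the message above reads like a strict membership test of the raw Content-Type header against a list of expected values, which fails as soon as the server appends a parameter such as `charset`. A minimal sketch of a more tolerant comparison follows. The helper names are hypothetical and this is not necessarily how juriscraper or the pending PR handles it:

```python
def media_type(content_type_header: str) -> str:
    """Return the bare media type, dropping parameters such as charset."""
    return content_type_header.split(";", 1)[0].strip().lower()


def has_expected_content_type(content_type_header: str, expected: list[str]) -> bool:
    """Compare only the media type, ignoring parameters and letter case."""
    return media_type(content_type_header) in [e.lower() for e in expected]


header = "application/pdf;charset=utf-8"
# The strict check that the error message suggests:
print(header in ["application/pdf"])
# The tolerant check:
print(has_expected_content_type(header, ["application/pdf"]))
```

With the header from the Sentry event, the strict check rejects the response while the tolerant one accepts it.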

grossir commented 5 months ago

There is a PR waiting for review and merge that should solve a bunch of these errors: freelawproject/courtlistener#3938

Some others are actual errors that happen from time to time and should be archived in Sentry once they happen, for example in nev

mlissner commented 5 months ago

Yes, but the problem with logger.error is that it creates Sentry errors I can't automatically assign. That means I get them all and have to forward them (like that other one I keep forwarding you).

Is there another way for Sentry to capture the error such that I can auto-assign it?

grossir commented 5 months ago

> Is there another way for Sentry to capture the error such that I can auto-assign it?

So, the ownership rules we have do not work on logger.error events because these events don't have a traceback "path". The path appears at the top of the Sentry event, and looks like: ValueError cl.scrapers.management.commands.cl_scrape_opinions in parse_and_scrape_site

A possible solution is to assign a Sentry tag, let's say source: juriscraper, to the logger.error events, and create a new ownership rule to assign those to me or @flooie.
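
One way the tag idea might look, as a rough sketch: `sentry_sdk.init` accepts a `before_send` callback that receives each event as a dict before it is sent. The logger-name prefix below is an assumption (the real logger names may differ), and the sketch assumes the event's `tags` field is a dict:

```python
from typing import Any, Optional

# Hypothetical prefix; adjust to whatever logger actually emits these errors.
JURISCRAPER_LOGGER_PREFIX = "juriscraper"


def tag_juriscraper_events(
    event: dict[str, Any], hint: dict[str, Any]
) -> Optional[dict[str, Any]]:
    """before_send hook: add source=juriscraper to events from matching loggers."""
    logger_name = event.get("logger") or ""
    if logger_name.startswith(JURISCRAPER_LOGGER_PREFIX):
        # Assumes "tags" is a dict of tag name to value.
        event.setdefault("tags", {})["source"] = "juriscraper"
    return event  # returning None would drop the event instead


# Wiring it up (not executed here; needs sentry_sdk and a real DSN):
#   sentry_sdk.init(dsn="...", before_send=tag_juriscraper_events)
```

An ownership rule along the lines of `tags.source:juriscraper <owner>` could then route those events, assuming tag matchers are available in the ownership rules for this project.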

There may be other solutions (for example, putting "path" information on the proper variables so our current ownership rules work), but I would need to investigate further.

mlissner commented 5 months ago

Tag seems like it could work? What about just throwing the error?
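
To illustrate the "just throw it" option: a raised exception carries a traceback, and the frames in that traceback are the kind of "path" information a path-based rule can match, whereas a plain logger.error message has none. A small sketch using only the standard library follows. The exception class here is a stand-in with the same name as the one in the Sentry event, and the function names are hypothetical:

```python
import traceback


class UnexpectedContentTypeError(Exception):
    """Stand-in for the exception named in the Sentry event."""


def check_content_type(header: str, expected: list[str]) -> None:
    """Raise instead of logging, so the error carries a traceback."""
    if header not in expected:
        raise UnexpectedContentTypeError(f'"{header}" not in {expected}')


def frames_for(exc: BaseException) -> list[str]:
    """List the function names recorded in the exception's traceback."""
    return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]


try:
    check_content_type("application/pdf;charset=utf-8", ["application/pdf"])
except UnexpectedContentTypeError as exc:
    # The last frame is the function that raised.
    print(frames_for(exc))
```

The trade-off is that raising interrupts the scrape at that point unless the caller catches the exception, which may or may not be desirable for errors that only happen from time to time.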