Open grossir opened 3 months ago
docker exec -it cl-django python /opt/courtlistener/manage.py cl_back_scrape_opinions --courts juriscraper.opinions.united_states.state.nc --backscrape-start=2020 --backscrape-end=2021
docker exec -it cl-django python /opt/courtlistener/manage.py cl_back_scrape_opinions --courts juriscraper.opinions.united_states.state.ncctapp --backscrape-start=2020 --backscrape-end=2022
Part of #929
On giving this a second look, I notice that we are missing records from the end of the year because of the way the scraper creates the URL:
self.url = "http://appellate.nccourts.org/opinions/?c=sc&year=%s" % date.today().year
This will miss opinions near the change of year, which seem to be posted in early January but are listed under the previous year's link. For example, for year 2021 we are missing all nc records under the "Mandate: 6 January 2022" section.

This gap could be filled by simply running the current backscraper, which will try to download everything again.
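
The year-boundary problem described above can be sketched as follows. This is a minimal illustration, not the scraper's actual code: `urls_to_check`, `BASE_URL`, and the 31-day `grace_days` window are hypothetical names and values chosen for the example. The only thing taken from the scraper is the URL pattern. The idea is that during early January the scraper would also visit the previous year's page, since late opinions seem to be filed there.

```python
from datetime import date
from typing import List

# URL pattern taken from the scraper; everything else here is an assumption.
BASE_URL = "http://appellate.nccourts.org/opinions/?c=sc&year=%s"


def urls_to_check(today: date, grace_days: int = 31) -> List[str]:
    """Return the year pages worth scraping on `today`.

    Hypothetical helper: if `today` is within `grace_days` of January 1st,
    the previous year's page is included as well, because opinions posted
    in early January may still be listed under the previous year.
    """
    years = [today.year]
    days_into_year = (today - date(today.year, 1, 1)).days
    if days_into_year < grace_days:
        # Put the previous year first so older records are seen first.
        years.insert(0, today.year - 1)
    return [BASE_URL % year for year in years]


# On 6 January 2022 both the 2021 and 2022 pages would be visited.
early_january = urls_to_check(date(2022, 1, 6))
# In June only the current year's page is visited.
mid_year = urls_to_check(date(2022, 6, 1))
```

With only `date.today().year`, a run on 6 January 2022 would request the 2022 page alone and never see records added to the 2021 page after the new year.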
nc
Between September 25, 2020 and February 05, 2021 we have 0 documents. We are missing around 50 published opinions from late 2020.
ncctapp
From November 4, 2020 to January 1, 2022 we have 0 documents, although the source has data for this period.