freelawproject / juriscraper

An API to scrape American court websites for metadata.
https://free.law/juriscraper/
BSD 2-Clause "Simplified" License

Collect neutral citations for `vt` #1150

Open grossir opened 3 months ago

grossir commented 3 months ago

They are inside the documents

vt example: https://www.vermontjudiciary.org/sites/default/files/documents/op24-015.pdf

vt_criminal example: https://www.vermontjudiciary.org/sites/default/files/documents/eo22-076_0.pdf
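As a rough illustration of what collecting the citation from the document text could look like, here is a minimal sketch. It assumes the neutral citation appears in the text in the form `<year> VT <number>` (for example `2024 VT 15`); the pattern, the function name `find_neutral_citation`, and the sample string are all hypothetical and are not taken from the juriscraper code.

```python
import re

# Assumed citation shape: "<year> VT <number>", e.g. "2024 VT 15".
# This pattern is an illustration, not the one used by the scraper.
NEUTRAL_CITATION = re.compile(r"\b(\d{4})\s+VT\s+(\d+)\b")


def find_neutral_citation(text):
    """Return (year, number) for the first neutral citation found, or None."""
    match = NEUTRAL_CITATION.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Hypothetical text, standing in for the content extracted from a PDF
sample = "State v. Doe, 2024 VT 15 (hypothetical caption)"
result = find_neutral_citation(sample)
```

The year would map to the `volume` column and `VT` to the `reporter` column seen in `search_citation`; how the trailing number is stored is an assumption here.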

flooie commented 2 months ago

This should be taken care of, thanks @grossir

grossir commented 5 days ago

Ran the recently merged command

```
./manage.py update_from_text --courts juriscraper.opinions.united_states.state.vt --cluster-status Published --date-filed-gte 2017-01-01 --date-filed-lte 2024-01-01
```

From:

```
courtlistener=> select volume, count(*) from search_citation where reporter = 'VT' group by volume order by 1 desc
 volume | count 
--------+-------
   2024 |    13
   2017 |    43
   2016 |   143
   2015 |   149
   2014 |   131
....  more years ...
```

To:

```
 volume | count 
--------+-------
   2024 |    63
   2023 |    11
   2022 |    39
   2021 |   100
   2020 |    98
   2019 |    88
   2018 |   137
   2017 |   109
   2016 |   143
   2015 |   149
   2014 |   131
   2013 |   115
   2012 |   105
   2011 |   142
   2010 |   123
   2009 |   127
   2008 |   139
   2007 |   200
   2006 |   147
   2005 |   135
   2004 |   128
   2003 |   110
```

This is a gain of 589 citations for the years 2017 to 2024, from 56 to 645. However, there are still big gaps in 2022 and 2023, and probably some missing records in the other years.
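The totals can be checked by summing the per-year counts from the two query results. This is only a sanity check on the numbers already listed in this comment; the variable names are made up, and years 2018 to 2023 are treated as zero before the run because they have no rows in the first result.

```python
# Per-year VT citation counts for 2017-2024, copied from the query results.
# Years missing from the "From" output had no rows, so they count as zero.
before = {2024: 13, 2017: 43}
after = {
    2024: 63,
    2023: 11,
    2022: 39,
    2021: 100,
    2020: 98,
    2019: 88,
    2018: 137,
    2017: 109,
}

total_before = sum(before.values())
total_after = sum(after.values())
gain = total_after - total_before

print(total_before, total_after, gain)  # 56 645 589
```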

UPDATE: after filling the gaps and collecting 201 new opinions, we have the following citation counts for 2023 and 2022, for a total of 658 new VT citations:

```
   2023 |    58
   2022 |    61
```
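The figure of 658 follows from replacing the 2023 and 2022 counts in the earlier result with the updated ones and comparing against the 56 citations that existed for 2017 to 2024 before the first run. A short check, using only numbers from this thread (the variable names are illustrative):

```python
# Counts for 2017-2024 after the first update_from_text run
after_first_run = {
    2024: 63,
    2023: 11,
    2022: 39,
    2021: 100,
    2020: 98,
    2019: 88,
    2018: 137,
    2017: 109,
}

# Updated counts after filling the 2022 and 2023 gaps
gap_fill = {2023: 58, 2022: 61}

# Later values override the earlier ones for the same year
final_counts = {**after_first_run, **gap_fill}

citations_before = 13 + 43  # 2024 and 2017 rows from the original query
new_citations = sum(final_counts.values()) - citations_before

print(sum(final_counts.values()), new_citations)  # 714 658
```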