Map state oral arguments sources to download

grossir commented 1 week ago

Ordered by population:

cal / California Supreme Court
- From 2016 to present
- Hosted on https://jcc.granicus.com/ which does not work for a NJ IP
- tex / Texas Supreme Court
  - From 1978 to 2012
  - Has transcripts
  - Has case data: docket number, case name, date argued
  - Newer audio files hosted by Texas Bar which also have case data
- texapp / Texas Courts of Appeals
  - 2nd, 3rd and 13th COA have oral arguments in the same format as tex, for 2008 to 2024, 2018, 2024, 2019 to 2024 reespectively
  - Some of the other COAs have official youtube channels
fla / Florida Supreme Court
- Links to: Videos available on facebook, youtube (from 2018 to present) and wsfu.org (from 1997 to present)
  - Transcripts available
fladistctapp
- 1st, 2nd, 4th district videos are uploaded to Youtube, but HTML directory has case metadata
ny / New York Court of Appeals
- From 2012 to present, videos hosted on Youtube
- Transcripts available
- Case data: only case name and month argued
pa / Pennsylvania Courts Couldn't find an oral arguments section in their website. Did find this, so it seems they don't publish the oral arguments?

The Commonwealth Court’s Internal Operating Procedures and the Pennsylvania Rules of Judicial Administration prohibit recording of oral arguments conducted and livestreamed by advanced video communication technology. See Section 502 of the Internal Operating Procedures of the Commonwealth Court, 210 Pa. Code § 69.502 (permitting only the recording by the Pennsylvania Cable Network (PCN) of en banc proceedings for future broadcast); Pennsylvania Rule of Judicial Administration 1910, Pa.R.J.A. 1910 (relating to broadcasting, recording and photography in the courtroom). See generally Section 124 of the Internal Operating Procedures of the Commonwealth Court, 210 Pa. Code § 69.124 (relating to video or teleconference proceedings). Violation of this directive may result in the imposition of sanctions.

Following the mention of "Pennsylvania Cable Network", I did found a courts section on that website with videos of oral arguments; but I can't find case data to link the audio properly

ohio / Ohio Supreme Court
- Videos hosted on Ohio Channel webpage
- Available from 1995 to present, Total Files: 4,397
- Case number, date argued; and for more recent cases, case name available
ga / Georgia Supreme Court
- Videos available only from April 2023 to present
- Case data available: case number, case name, date argued
nc / North Carolina Supreme Court
- Videos on their youtube channel
- Video title has case number and case name
mich / Michigan Supreme Court
- Videos on their youtube
- Video title has case number and case name
nj / New Jersey Supreme Court
- Video and audio are available, only from February 2024. However, on the general search interface the cases with oral arguments are highlighted, and the links could be collected from there. In general, there is data since 2005, as the listing in the public bucket shows
- Case name, case number and date argued are available on the "recent webcasts page" and on the search interface
va / Virginia supreme court
- Audio is available since 2014
- Case number and case name are available. Date argued is approximate
wa / Washington Supreme court
- Video available on tvw.org.
- Case data seems to be available on the video's description
ariz / Arizona Supreme Court
- Video available on "granicus.com", which as in cal, does not work for me
- case name and date argued available
- Data since 2006 to present
tenn / Tennessee Supreme Court
- 6773 audio tracks available on soundcloud with case name and case number data
- Videos on youtube
  
  Audio recordings of oral arguments in the appellate courts heard on or after May 1, 2013 are available 21 days after the oral argument. You may access the audio recording by searching the particular case and clicking on the SoundCloud link found within the case information. Within 10 days of the oral argument, parties may ask the court to exclude the audio recording from being posted. If the court grants the request, the oral argument will not be posted.
mass / Massachusetts Supreme Court
- Videos available on Youtube. Usually the case name and number are on the title
ind / Indiana Supreme Court
- 1371 audio files since 2010
- date argued and case name available
- Court of Appeals has 800 files, but I think we don't scrape that court yet
indtc / Indiana Tax Court
- 160 audio files since 2007
mo / Missouri Supreme Court
- This site has our servers blocked; so we probably won't be able to scrape oral arguments
- I find no comprehensive list of oral arguments, but they are linked on the Decision pages (example) which would be easy to iterate over.
- "Archives of each day's audio files since September 2007 are posted to that day's "cases scheduled" pages. "
md / Maryland Supreme Court
- mp4 files available since 2020. Page says they archive oral arguments "from May 2007 to the present", but there is no link for the older ones
- Recording or copying of any portion of the live webcast or the archived recording of a webcast is prohibited without the express permission of the Supreme Court
wis / Wisconsin Supreme Court
- wma files available since September 1997.
- Has case number, case name, and a date (date argued?)
colo / Colorado Courts
- Has a search API behind this page, but no listings. We may be able to get the full list using the endpoint
- The search page does not display much case data; but the API return a case name, number and a date
minn / Minnessota Supreme Court
- mp4 files available since September 2005. One has to filter by date or last digits of oral argument
- case name, case number and date oral argument available
sc / South Carolina Supreme Court
- mp4 files available since 2014
- case name, case number and date argued metadata available

mlissner commented 1 week ago

Looking good, @grossir! Now I have the next hard question: How many hours or files, approximately, on each — or put another way, where do we start? The other question is what do we do about video? We could probably start storing it, but we'd want to optimize/normalize the file types, and price out the storage costs, since they might start to matter....

Looks like this will be a big project.

grossir commented 1 week ago

I will try to calculate seconds available where possible, but I think the number of files is a decent proxy. Most sites do not list any oral arguments statistics, and I would have to implement basically a scraper to get the numbers.

I think the best way to start is to implement the sources that match our current model (audio files with case metadata), needing the least effort: we will just implement the scraper / backscraper.

Texas' courts tex and texapp hold a lot of data. Then va, tenn, ind and indtc and, a little trickier, nj, for the courts I have mapped so far

To include video would take us more time, having to implement model and doctor changes; changes to the frontend to watch the videos; and having to calculate storage costs. Do we want that, anyway? Why not extract the audio from the video? Related: #44

After scraping the courts that self host their audio, I think we should work on the ones that upload to Youtube, since they all have a related scraping / processing step. Luckily in this step we would scrape some of the big courts, like ny and fla

Finally, the ones that "self" host their videos, or use a provider like granicus (cal is one of those)

mlissner commented 1 week ago

That all sounds good. Start with the easy stuff and then move to the trickier stuff.

I'm not sure what we should do about video. Long term, probably the right thing is to extract audio from it, and to also host the video, so API users can choose if they want audio or video.

Hosting video is going to be expensive and complex, so maybe step one is just to scrape and store video with a cheap storage class, and step two will be to actually figure out how to serve it.

But for now, yes, let's finish the survey, and when we're ready, we can start with scraping audio, then do video in a second phase.

freelawproject / juriscraper

Map state oral arguments sources to download #1047