Closed teovin closed 2 months ago
I just did a quick pass for code style, and LGTM! Left two tiny suggestions 🙂
I also took the liberty of adding Jack as a reviewer, who I expect might be more equipped than me to address your more detailed questions 🙂
Thank you Becky, I addressed your suggestions in my last commit. And I will work on any changes that Jack might suggest, especially those around the questions I had as you mentioned.
I noticed we are hiding some elements like .parties, .decisiondate and .docketnumber in the case-text class. What's the reasoning behind this?
We want to render the top part of the head matter ourselves, rather than use the info printed in the book -- that lets us provide more consistent formatting between cases published in different books. Check out cap_header.html for where that's done. My guess is you have to adapt that business logic to also work with CL.
So some fields are hidden because the custom header makes them redundant. I wasn't part of this, but I'm guessing we're hiding other fields like syllabus and parties simply for user preference. As long as we're rendering the same as cases fetched from the CAP API, let's not revisit that decision for now.
I haven't checked whether you're doing this yet -- I think we'll want to record which CourtListener field was used to populate the case. For example, I'm pretty sure that if we do need footnote_regexes, we only need it when xml_harvard was the source.
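A minimal sketch of what recording the source field could look like, assuming the opinion payload is a plain dict and that the fallback order is the one described in the PR (xml_harvard, then html, then plain_text). The function names and the returned tuple are hypothetical and not taken from the codebase, and the sketch skips the filepath_json_harvard check on the cluster.

```python
from typing import Optional, Tuple

# Fallback order as described in the PR description; the names below are
# illustrative only and do not come from the actual codebase.
SOURCE_PRIORITY = ("xml_harvard", "html", "plain_text")


def pick_opinion_source(opinion: dict) -> Tuple[Optional[str], str]:
    """Return (field_name, content) for the first non-empty source field."""
    for field in SOURCE_PRIORITY:
        content = opinion.get(field)
        if content:
            return field, content
    # No usable content in any of the known fields.
    return None, ""


def needs_footnote_regexes(source_field: Optional[str]) -> bool:
    """Assumption from the review comment: only xml_harvard needs them."""
    return source_field == "xml_harvard"
```

Storing the returned field name alongside the imported case would let later steps (such as footnote handling) branch on the source without re-inspecting the payload.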
This looks great -- I think with updates it'll be good to test on stage.
... but we might want a feature flag since xml conversion isn't ready yet.
I added a template for CourtListener, modeled after cap_header.html. One change I made to both was to remove the div with legal_doc.get_title, since the get_title method didn't exist and so the div wasn't rendering anything.
This is a WIP PR for the CL case XML -> HTML conversion integration.
Things that were done:
- [...] xml_harvard fields from the opinions endpoint if the cluster has a filepath_json_harvard, otherwise the html field will be used (with plain_text as worst case scenario).
- [...] U, 2000 with source CU). Source descriptions here.
- [...] type and id attributes in the source xml.

A few bug fixes were made:
- [...] effectiveDate to prevent errors that are thrown if the CL API returns a date string longer than 25 chars. I saw that was the case for some clusters with the time offset including seconds.
- [...] ids with the cluster_ids. Because the ids and the cluster_ids do not match in the search endpoint results, the subsequent clusters endpoint call with id was erroring out.
- [...] opinion ids (grabbed from the cluster endpoint response sub_opinions field) since we need to look at all sub_opinions to construct the Harvard xml. Previously the search result id was being used.
- [...] xml_harvard data, and no html, so I defaulted to use the plain_text field of the opinion.

Things to consider:
Sample converted legal doc (chopped):
This is how it would look if the elements I mentioned above weren't set to display: none.
Here is a case that both CAP and CL return, and how each looks when imported (both chopped):
CAP (with display: none removed from .case-text .syllabus):
CourtListener (with display: none removed from elements in headmatter):
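For reference, two of the bug fixes listed in the description (the effectiveDate length issue and collecting opinion ids from sub_opinions) might be sketched as below. This is an illustration under assumptions, not the code in this PR: it assumes the date fix amounts to cutting the string to 25 characters, and that each sub_opinions entry is a URL ending in the numeric opinion id. All function names are hypothetical.

```python
from typing import List

# "YYYY-MM-DDTHH:MM:SS+HH:MM" is 25 characters; an offset that also carries
# seconds (e.g. "-07:52:58") pushes the string past that length.
MAX_DATE_LEN = 25


def clip_effective_date(raw: str) -> str:
    """Assumed fix: drop anything past the 25th character."""
    return raw[:MAX_DATE_LEN]


def opinion_ids_from_cluster(cluster: dict) -> List[int]:
    """Collect opinion ids from a cluster response's sub_opinions field.

    Assumes each entry is a URL whose last path segment is the opinion id.
    """
    ids = []
    for url in cluster.get("sub_opinions", []):
        last_segment = url.rstrip("/").rsplit("/", 1)[-1]
        ids.append(int(last_segment))
    return ids
```

Iterating over every entry in sub_opinions, rather than reusing the search result id, matches the point above that all sub_opinions are needed to construct the Harvard xml.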