freelawproject / courtlistener

A fully-searchable and accessible archive of court data including growing repositories of opinions, oral arguments, judges, judicial financial records, and federal filings.
https://www.courtlistener.com

citations going the wrong way in time #557

Open idc9 opened 8 years ago

idc9 commented 8 years ago

There are a number of edges going forward in time in the edge list from the bulk downloads, i.e. the citing case was decided before the case it cites. For example, Zadvydas v. Davis (2001) cites Demore v. Kim (2003).

Attached is a list of 473 problem edges just from the SCOTUS network (i.e. both citing and cited cases are SCOTUS cases).

backwards_edges.txt
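
For anyone who wants to reproduce the check, here is a minimal sketch of how such edges can be flagged. It assumes the edge list has already been loaded as `(citing, cited)` pairs and that a filing date is known for each case. The function name, the case keys, and the data layout are hypothetical, not the bulk download format, and the month and day in the sample dates are placeholders (only the years come from this thread).

```python
from datetime import date


def find_backward_edges(edges, filed_dates):
    """Return (citing, cited) pairs where the citing case predates the cited case.

    edges: iterable of (citing_id, cited_id) pairs (hypothetical layout).
    filed_dates: dict mapping case id -> datetime.date.
    Edges with a missing date on either side are skipped, not flagged.
    """
    backward = []
    for citing, cited in edges:
        citing_date = filed_dates.get(citing)
        cited_date = filed_dates.get(cited)
        if citing_date is None or cited_date is None:
            continue
        if citing_date < cited_date:
            backward.append((citing, cited))
    return backward


# Placeholder dates: only the years are taken from the example above.
filed_dates = {
    "zadvydas": date(2001, 1, 1),
    "demore": date(2003, 1, 1),
}
edges = [("zadvydas", "demore"), ("demore", "zadvydas")]

print(find_backward_edges(edges, filed_dates))
```

Comparing full dates rather than years also catches same-year inversions, which a year-only comparison would miss.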

mlissner commented 8 years ago

Thanks. Yeah, I looked at this briefly and it seems like the code should be fine:

From match_citations.py:

    # Set up filter parameters
    if citation.year:
        start_year = end_year = citation.year
    else:
        start_year, end_year = get_years_from_reporter(citation)
        if citing_doc is not None and citing_doc.cluster.date_filed:
            end_year = min(end_year, citing_doc.cluster.date_filed.year)
    main_params['fq'].append(
        'dateFiled:%s' % build_date_range(start_year, end_year)
    )

That seems more or less OK, so we'll need to dig in further.
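
As a starting point for digging in, the year-range logic from the snippet can be pulled out into a standalone function and exercised by hand. This is only a sketch: `filter_year_range` and its parameters are hypothetical names, `reporter_years` stands in for the result of `get_years_from_reporter(citation)`, and the reporter range used below is made up. As quoted, the clamp to the citing document's year only runs in the `else` branch, so a citation that carries its own year is not limited by the citing document's date. Whether that path produced the reported edges is not established in this thread.

```python
def filter_year_range(citation_year, reporter_years, citing_year=None):
    """Mirror the branching of the quoted snippet (hypothetical standalone form).

    citation_year: year parsed from the citation, or None.
    reporter_years: (start, end) tuple standing in for get_years_from_reporter().
    citing_year: filing year of the citing document, or None.
    """
    if citation_year:
        # The citation's own year is used as-is; citing_year is not consulted.
        return citation_year, citation_year
    start_year, end_year = reporter_years
    if citing_year is not None:
        # Only this branch caps the range at the citing document's year.
        end_year = min(end_year, citing_year)
    return start_year, end_year


# Made-up reporter range, for illustration only.
print(filter_year_range(2003, (1790, 2020), citing_year=2001))
print(filter_year_range(None, (1790, 2020), citing_year=2001))
```

The first call returns a range entirely after the citing year, while the second is capped at it.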