CodeForPhilly / pbf-scraping

Project for Philadelphia Bail Fund to scrape new criminal filings from municipal court
https://codeforphilly.github.io/pbf-scraping

super long offense description in one of the dockets #68

Closed. irishryoon closed this issue 3 years ago

irishryoon commented 3 years ago

For docket MC-51-CR-0011764-2020, the parsed "offenses" column contains irrelevant information. The second-to-last item of the current output includes text that is not part of an offense description, such as "Proceed to Court", a date, and "F1 F2 ...":

['Rape Forcible Compulsion', 'Unlawful Contact With Minor - Sexual Offenses', 'Unlawful Restraint/ Serious Bodily Injury', 'Sexual Assault', 'Corruption Of Minors - Defendant Age 18 or Above', 'Endangering Welfare of Children - Parent/Guardian/Other Commits Offense', 'Indecent Assault Forcible Compulsion', 'Statutory Sexual Assault: 11 Years Older Rape of Child IDSI Forcible Compulsion', 'Rape of Child IDSI Forcible Compulsion Incarceration/Diversionary Period Disposition Date Offense Disposition Sentence Date 06/16/2020 Proceed to Court Proceed to Court Proceed to Court Proceed to Court Proceed to Court F1 F1 F2 F2 F3', 'IDSI Forcible Compulsion']

This issue was more common in the old version of the parsed CSV file. With the new parsing, I found only one instance of it.
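
One way to look for other affected rows is to flag offense strings that contain the kind of stray text seen in the example above. The following is a minimal sketch, assuming the offenses for a docket are available as a list of strings. `LEAK_PATTERNS` and `find_suspect_offenses` are hypothetical names, not part of the project's parser, and the patterns are taken only from this one example, so they may miss other cases or flag legitimate descriptions.

```python
import re

# Hypothetical markers for text that leaked into an offense description.
# All of them are taken from the single example in this issue.
LEAK_PATTERNS = [
    re.compile(r"Disposition Date|Offense Disposition|Sentence Date"),
    re.compile(r"Proceed to Court"),
    re.compile(r"\b\d{2}/\d{2}/\d{4}\b"),             # dates such as 06/16/2020
    re.compile(r"\b[FM][1-3](?:\s+[FM][1-3])+\b"),    # runs of codes such as F1 F1 F2
]


def find_suspect_offenses(offenses):
    """Return (index, offense) pairs whose text matches any leak pattern."""
    return [
        (i, text)
        for i, text in enumerate(offenses)
        if any(pattern.search(text) for pattern in LEAK_PATTERNS)
    ]
```

Calling `find_suspect_offenses` on each row's offense list and keeping the rows with a non-empty result gives a short list of dockets to inspect by hand.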

adamrlinder commented 3 years ago

Hey @bertamb, are you still working on this one? I think it is the last issue related to docket parsing.

bertamb commented 3 years ago

Sorry, it took me a while! But I found the problem and it is fixed now.