Open bepetersn opened 10 years ago
@bepetersn Good summary of all the caveats connected with this issue.
@hancush Can you check if parserator is appropriate for parsing ILCS statutes? Do you have to train it? How much training data does it need?
Also, I think https://github.com/sc3/python-ilcs might be a better place for this.
We've talked about switching to only allowing lookup of IUCR codes with ILCS bits. If we made this change, clients would be responsible for parsing ILCS reference strings themselves, though we could at least try to help with that parsing.
The simplest thing would be to provide a regex with which to parse ILCS reference strings as a constant, like `ILCS_FORMAT`. Another possibility would be to expose a method which is capable of doing some parsing for the client.

Problems arise in that there might be multiple formats in which ILCS data could appear, which suggests a more involved approach as opposed to providing a single regex.
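To make the two options concrete, here is a rough sketch of what an `ILCS_FORMAT` constant plus a small parsing helper could look like. Nothing here is existing code: the patterns, the compact `720-5/12-3.05` variant, the group names, and `parse_ilcs` are all assumptions for illustration, based on the usual `<chapter> ILCS <act>/<section>` citation form. Trying a list of patterns in order is one way to cope with multiple input formats.

```python
import re

# Trailing piece shared by both patterns: a section such as "12-3.05",
# followed by zero or more parenthesized subsections such as "(a)(1)".
_SECTION = r"(?P<section>[\w.\-]+)(?P<subsections>(?:\([^()\s]+\))*)\s*$"

# Hypothetical constant: matches the long form, e.g. "720 ILCS 5/12-3.05(a)(1)".
ILCS_FORMAT = re.compile(
    r"^\s*(?P<chapter>\d+)\s*ILCS\s*(?P<act>\d+)\s*/\s*" + _SECTION,
    re.IGNORECASE,
)

# Hypothetical second format (an assumption): compact form, e.g. "720-5/12-3.05".
ILCS_FORMAT_COMPACT = re.compile(
    r"^\s*(?P<chapter>\d+)\s*-\s*(?P<act>\d+)\s*/\s*" + _SECTION
)

# Patterns are tried in order; supporting a new format means appending here.
ILCS_PATTERNS = [ILCS_FORMAT, ILCS_FORMAT_COMPACT]


def parse_ilcs(reference):
    """Return a dict of ILCS bits, or None if no known format matches."""
    for pattern in ILCS_PATTERNS:
        match = pattern.match(reference)
        if match:
            return match.groupdict()
    return None


if __name__ == "__main__":
    print(parse_ilcs("720 ILCS 5/12-3.05(a)(1)"))
    print(parse_ilcs("720-5/12-3.05"))
    print(parse_ilcs("not a statute"))
```

Clients who only need the common case could use `ILCS_FORMAT` directly, while `parse_ilcs` hides the format differences. Real-world data likely contains variants this sketch does not cover.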