Closed evan-benoit closed 1 year ago
@evan-benoit - updated your comment to include code fencing around the tags.
I'm looking into this. If you can provide a few additional example IDs, that will help me to investigate. My initial thinking is that this is in the source data.
Sure, here are a few other examples, all with unmatched `<p>` tags.
I'm finding this problem in about 2% of the BILLSTATUS documents.
Thank you -- the team that helps supply this is aware of the issue and working to address it by replacing a legacy system. I don't know the exact timeline for this to be completed.
Thanks, I appreciate the speedy response!
As an update, this is still being worked on upstream of us. This is being tracked by the Library of Congress here: https://github.com/LibraryOfCongress/api.congress.gov/issues/2
I am closing the issue here because it will end up being resolved upstream and then we will update our BILLSTATUS and BILLSUM files.
Hello GovInfo! We're seeing that about 2% of the BILLSTATUS documents we examine have faulty HTML in the `<billSummaries>` section. For example: https://www.govinfo.gov/bulkdata/BILLSTATUS/117/s/BILLSTATUS-117s294.xml

The `<billSummaries>` section has unclosed `<p>` tags. Some of the `<p>` tags have a corresponding `</p>` tag, but others do not. Any idea why this is? Can anything be done about it?

Thanks! -Evan
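For anyone who wants to flag affected summaries on their side while the upstream fix is pending, here is a minimal Python sketch of one way to detect the unclosed `<p>` tags described above. It is not GovInfo's or the Library of Congress's method. It assumes you have already pulled the summary HTML out of the BILLSTATUS XML as a plain string; the names `ParagraphBalanceChecker` and `count_unclosed_paragraphs` are made up for this illustration. It treats a `<p>` as unclosed if another `<p>` starts, or the fragment ends, before a `</p>` is seen.

```python
from html.parser import HTMLParser


class ParagraphBalanceChecker(HTMLParser):
    """Hypothetical helper: tracks <p> tags that never get a matching </p>."""

    def __init__(self):
        super().__init__()
        self.paragraph_open = False  # True while a <p> is waiting for its </p>
        self.unclosed = 0            # <p> tags cut off by a following <p>

    def handle_starttag(self, tag, attrs):
        if tag != "p":
            return
        if self.paragraph_open:
            # A new <p> began before the previous one was closed.
            self.unclosed += 1
        self.paragraph_open = True

    def handle_endtag(self, tag):
        if tag == "p":
            self.paragraph_open = False


def count_unclosed_paragraphs(fragment: str) -> int:
    """Return how many <p> tags in an HTML fragment lack a closing </p>."""
    checker = ParagraphBalanceChecker()
    checker.feed(fragment)
    checker.close()
    # A <p> still open at the end of the fragment is also unclosed.
    return checker.unclosed + (1 if checker.paragraph_open else 0)


if __name__ == "__main__":
    sample = "<p>First paragraph.</p><p>Second paragraph, never closed."
    print(count_unclosed_paragraphs(sample))
```

A result greater than zero marks a summary as affected. The parser from the standard library is used here instead of a strict XML parser because it tolerates malformed markup rather than raising an error on it.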