usgpo / api

services to access govinfo content and metadata
https://api.govinfo.gov

null date fields in BILLS files #82

Open evan-benoit opened 3 years ago

evan-benoit commented 3 years ago

Hello! I'm a Senior Engineering Manager at FiscalNote, managing our Congressional Quarterly product. My engineers are making extensive use of the GovInfo API (https://api.govinfo.gov/docs/) to download Bill Texts, Bill Status, Committee Reports, Federal Registers, and Congressional Records. It's a great system, very well-documented!

One thing we've noticed is that the Bill Text XML files don't always have data in the `<dc:date>` field. Most bill versions do, but a small percentage of them don't. Do you know why that might be? Is this a bug on your side, or is it expected behavior?

Some examples of XML files that are missing data in the `<dc:date>` field:

On the other hand, here's an example of one that does have a date:

Any guidance would be appreciated. Thanks!

jonquandt commented 3 years ago

@evan-benoit I'm checking into this to give you a more definitive answer.

In the meantime, you may be interested in using the dateIssued value from the package summary or MODS, or in pulling the date from the associated BILLSTATUS XML for the given bill version.
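
For readers who want to try the package summary route, here is a minimal Python sketch. It assumes the summary is served at `/packages/{packageId}/summary` with an `api_key` query parameter and that the JSON response carries a top-level `dateIssued` field, as the comment above suggests. The function names are hypothetical, and the package ID in the usage note is taken from the URL path of one of the example files.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

API_ROOT = "https://api.govinfo.gov"


def summary_url(package_id, api_key):
    """Build the package summary URL (endpoint shape is an assumption)."""
    query = urlencode({"api_key": api_key})
    return f"{API_ROOT}/packages/{quote(package_id)}/summary?{query}"


def date_issued(summary):
    """Return dateIssued from a parsed summary dict, or None if absent or empty."""
    return summary.get("dateIssued") or None


def fetch_date_issued(package_id, api_key):
    """Download the package summary and return its dateIssued value (needs network)."""
    with urlopen(summary_url(package_id, api_key)) as resp:
        return date_issued(json.load(resp))
```

Calling `fetch_date_issued("BILLS-116s1342es", your_api_key)` would then give a fallback date for a bill version whose XML has no date of its own.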

jonquandt commented 3 years ago

Based on some additional investigation, it appears that this occurs when the source Bill XML does not include an `<action-date>` tag with a date attribute. There are some minor additions to fill in the Dublin Core metadata when GPO receives the files.

Of the four examples above, only the last one has the action-date tag with a date attribute.

https://www.govinfo.gov/content/pkg/BILLS-116s1342es/xml/BILLS-116s1342es.xml
https://www.govinfo.gov/content/pkg/BILLS-116hjres28eh/xml/BILLS-116hjres28eh.xml
Do not have action-date tags

https://www.govinfo.gov/content/pkg/BILLS-116hr1754rds/xml/BILLS-116hr1754rds.xml Has an action-date, but no date attribute
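
The three cases described above (no action-date tag, an action-date tag without a date attribute, and an action-date tag with one) can be told apart with a short check. This is a rough sketch only: it assumes the `action-date` element is not namespaced in the bill XML, and the function name and status labels are made up for illustration.

```python
import xml.etree.ElementTree as ET


def action_date_status(bill_xml):
    """Classify a bill XML document by its action-date tags.

    Returns a (status, date) tuple where status is one of:
      "no-action-date"    - the document has no <action-date> element
      "no-date-attribute" - <action-date> exists but none has a non-empty date attribute
      "has-date"          - the first non-empty date attribute found is returned
    """
    root = ET.fromstring(bill_xml)
    tags = list(root.iter("action-date"))
    if not tags:
        return ("no-action-date", None)
    for tag in tags:
        date = tag.get("date")
        if date:
            return ("has-date", date)
    return ("no-date-attribute", None)
```

Running this over downloaded bill text files would flag which versions need a fallback date source such as dateIssued.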