USDA / USDA-APIs

Do you have feedback, ideas, or questions for USDA APIs? Use this repository's Issue Tracker to join the discussion.
www.usda.gov/developer

Inconsistency between API DOC and Field_Descriptions_Apr2020 #99

Open andreivladmatei opened 3 years ago

andreivladmatei commented 3 years ago

I'm in the process of integrating my existing recipe/ingredient database with USDA's new API.

My current intent is to store only the FDC_ID as an externalId in the recipe-ingredient list and fetch the data each time a recipe is rendered (thereby avoiding a hard copy from USDA to my own DB).

I found the following documentation inconsistency:

Does anyone have a recommendation on how to store external references? The last thing I need is to have recipes with missing ingredients/foods.

littlebunch commented 3 years ago

Yeah, I think FDC_ID should be called "unique permanent version identifier" -- at least for Branded Food Products. Don't know which data set you're working with, but I've found UPC/GTIN to be a reliable way to retrieve all versions of a Branded Food Product food. For cases where UPC/GTIN returns multiple versions of a food, you then need to look at the published date to get the latest version. I know the old Standard Legacy data used a NDBNo which was unique for a food. Not sure about FNDDS.
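A minimal sketch of the "latest version per UPC/GTIN" selection described above. It works on records already fetched into memory, and it assumes each record is a dict with `fdcId`, `gtinUpc`, and an ISO-formatted `publishedDate`; those field names, the helper `latest_version`, and all sample values are illustrative assumptions, not confirmed API output.

```python
from datetime import date


def latest_version(records, gtin):
    """Return the most recently published record for a GTIN/UPC, or None.

    Hypothetical helper: `records` is assumed to be a list of dicts with
    "fdcId", "gtinUpc" and "publishedDate" (YYYY-MM-DD) keys.
    """
    matches = [r for r in records if r.get("gtinUpc") == gtin]
    if not matches:
        return None
    # Compare parsed dates rather than raw strings.
    return max(matches, key=lambda r: date.fromisoformat(r["publishedDate"]))


# Made-up sample data: two versions of one product, plus an unrelated one.
records = [
    {"fdcId": 1001, "gtinUpc": "000000000017", "publishedDate": "2019-04-01"},
    {"fdcId": 1002, "gtinUpc": "000000000017", "publishedDate": "2020-04-29"},
    {"fdcId": 1003, "gtinUpc": "000000000024", "publishedDate": "2019-12-06"},
]

print(latest_version(records, "000000000017")["fdcId"])  # 1002
```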

andreivladmatei commented 3 years ago

For the moment I'm using only Foundation and SR Legacy (Branded to be adopted sometime in the future).

I'm only worried about the possibility of FDC_IDs being deleted when the data is updated, based on this quote: "Each time the data in a food record changes, that food item receives a new FDC_ID number."

hphungnal commented 3 years ago

@andreimateiro I have also forwarded your question to the FDC lead. UPDATE: It seems like they're out of office until 8/18. I will try to follow up then.

andreivladmatei commented 3 years ago

Thanks!

KyleMcKillop-USDA commented 3 years ago

@andreimateiro @hphungnal That is correct. FDC ID is more an identifier of the published record than an identifier of the item. The idea was to make it easy to reference the data on a record so that the values do not differ if you access it at a different time. Instead, each dataset has its own identifier for the food: SR and Foundation use ndb_no, FNDDS uses food_code, and Branded uses the gtin_upc field.

There was a report of older FNDDS FDC_IDs being removed, but this was not intentional and we are looking at how to add them back into the system in a future release (in the meantime, those are still available in the download file from April 2019). We do not plan on removing fdc_ids of Foundation and SR; as noted, a new record with a new FDC_ID will be inserted instead.

Let me know if there is anything I can further clarify!
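For the original question of how to store an external reference, one possible shape is sketched below: keep the dataset's own identifier next to the FDC ID, so the food can still be located if a particular published record goes away. The per-dataset fields come from the explanation above; the dataset labels, `ExternalFoodRef`, and `make_ref` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

# Native identifier field per dataset, as described above.
# The dictionary keys are illustrative labels, not official dataset names.
NATIVE_ID_FIELD = {
    "foundation": "ndb_no",
    "sr_legacy": "ndb_no",
    "fndds": "food_code",
    "branded": "gtin_upc",
}


@dataclass(frozen=True)
class ExternalFoodRef:
    dataset: str    # one of the keys in NATIVE_ID_FIELD
    native_id: str  # identifies the food itself across published versions
    fdc_id: int     # identifies the specific published record last seen


def make_ref(dataset, record, fdc_id):
    """Build a reference from a record dict holding the native id field."""
    field = NATIVE_ID_FIELD[dataset]
    return ExternalFoodRef(dataset, str(record[field]), fdc_id)


# Made-up values for illustration only.
ref = make_ref("fndds", {"food_code": 12345678}, fdc_id=111)
print(ref.native_id)  # 12345678
```

Storing both values keeps the stable snapshot that the FDC ID provides while leaving a second key to fall back on.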

andreivladmatei commented 3 years ago

Thanks, this really helps. I will close this issue now.

andreivladmatei commented 3 years ago

I'm reopening this thread as I found several FDCIDs that don't exist anymore:

e.g., 785758, 789038, and many more.

@Kyle-McKillop, based on the logic presented above, I was hoping to query the same FDC ID and find the same info, but now I get a 404, as if it had never existed.

andreivladmatei commented 3 years ago

@hphungnal , @littlebunch, @Kyle-McKillop making sure this message reaches you.

KyleMcKillop-USDA commented 3 years ago

@andreimateiro I was out of office, so I apologize for the delay in my response. Looking into it, these IDs were for previous FNDDS cycles. It seems that in the release of the latest FNDDS Survey Foods in October, the previous cycle years were removed. This is not the intended behavior.

Currently you can find information on those previous two cycles at https://fdc.nal.usda.gov/download-datasets.html if there is urgency in retrieving those older records. I will need to look into how to get these two cycles re-added with their existing FDC IDs back into the dataset. I do not currently have a timeline for that.

What I can say for sure is: FDC IDs shouldn't be removed. As far as I can tell, this error is only for previous FNDDS cycles, which occur every two years. Our next FNDDS cycle update will not be until 2023.

andreivladmatei commented 3 years ago

Thanks for the update. The problem is that I cache these IDs as "external ids" in my app (to avoid hitting your API too often and to avoid the latency overhead that sometimes occurs), so I probably have hundreds of orphan records. I guess other apps/websites rely on similar mechanisms.

Is there any other way I can relink them? Going one by one is clearly not something I can do.

KyleMcKillop-USDA commented 3 years ago

@andreimateiro . I'll ask our contractors and see if there are any thoughts. The downloads, as listed above, both have a table called "survey_fndds_food.csv" that could at least document the FDC_IDs that are currently missing. FNDDS 2015-2016, and FNDDS 2013-14 are the missing cycles.
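One way the downloaded tables might be used to relink cached IDs in bulk is sketched here. It assumes "survey_fndds_food.csv" has `fdc_id` and `food_code` columns and that a matching `food_code` in a newer cycle refers to the same food; both are assumptions that should be checked against the actual files. The function names and sample rows are made up.

```python
import csv
import io


def load_pairs(csv_file):
    """Read (fdc_id, food_code) pairs from an open CSV file object."""
    return [(int(row["fdc_id"]), row["food_code"]) for row in csv.DictReader(csv_file)]


def remap_fdc_ids(cached_ids, old_cycle_file, new_cycle_file):
    """Map cached FDC IDs from an old cycle to a newer cycle via food_code.

    Returns (remapped, unmatched): a dict of old id -> new id, and a list
    of cached ids with no counterpart in the newer cycle.
    """
    old_code_by_id = dict(load_pairs(old_cycle_file))
    new_id_by_code = {code: fdc_id for fdc_id, code in load_pairs(new_cycle_file)}
    remapped, unmatched = {}, []
    for fdc_id in cached_ids:
        code = old_code_by_id.get(fdc_id)  # None if the id is not in the old file
        new_id = new_id_by_code.get(code)
        if new_id is None:
            unmatched.append(fdc_id)
        else:
            remapped[fdc_id] = new_id
    return remapped, unmatched


# Made-up sample rows standing in for the two downloaded files.
old_csv = io.StringIO("fdc_id,food_code\n1,11111111\n2,22222222\n")
new_csv = io.StringIO("fdc_id,food_code\n10,11111111\n")

remapped, unmatched = remap_fdc_ids([1, 2, 3], old_csv, new_csv)
print(remapped)   # {1: 10}
print(unmatched)  # [2, 3]
```

With real downloads, the `io.StringIO` objects would be replaced by files opened with `open(path, newline="")`. Anything left in `unmatched` still needs a manual lookup.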

andreivladmatei commented 3 years ago

@Kyle-McKillop , any updates here?

I would really appreciate it if those entries could be re-added. In the meantime, is there any way to remap an FDCID, even manually?

For example, FDCID: 786756, with name: "Berries, frozen, NFS" is missing in the new cycles. I would expect at least to find the name with new values, but by searching the API, nothing is retrieved.

Any thoughts?

andreivladmatei commented 3 years ago

Actually, I managed to go ID by ID (man, that was complicated), and only a few don't have counterparts in the new FNDDS DB. For the moment I'm good.

Is it possible to enforce some sort of protection so that there is no more data loss in the future?