Open andreivladmatei opened 3 years ago
Yeah, I think FDC_ID should be called "unique permanent version identifier" -- at least for Branded Food Products. I don't know which data set you're working with, but I've found UPC/GTIN to be a reliable way to retrieve all versions of a Branded Food Product. When a UPC/GTIN returns multiple versions of a food, you then need to look at the published date to get the latest version. I know the old Standard Reference (SR) Legacy data used an NDBNo, which was unique to a food. Not sure about FNDDS.
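A minimal sketch of the "pick the latest version by published date" step described above. It assumes the versions for one UPC/GTIN have already been fetched into a list of dicts; the key names (`fdc_id`, `gtin_upc`, `published`) and the sample values are made up for illustration and are not taken from the FDC API response format.

```python
from datetime import date


def latest_version(records):
    """Return the record with the most recent published date, or None if empty.

    Assumes each record carries an ISO-formatted date string under the
    hypothetical key "published".
    """
    return max(
        records,
        key=lambda r: date.fromisoformat(r["published"]),
        default=None,
    )


# Made-up sample: three versions of one branded food sharing a UPC/GTIN.
versions = [
    {"fdc_id": 111, "gtin_upc": "000000000001", "published": "2019-04-01"},
    {"fdc_id": 222, "gtin_upc": "000000000001", "published": "2020-10-30"},
    {"fdc_id": 333, "gtin_upc": "000000000001", "published": "2020-02-27"},
]

newest = latest_version(versions)
```

Comparing parsed dates rather than raw strings keeps the selection correct even if the date format of the source ever stops sorting lexicographically.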
For the moment I'm using only Foundation and SR Legacy (Branded to be adopted sometime in the future).
I'm only worried about the possibility of FDC_IDs being deleted when a record is updated, based on this quote: "Each time the data in a food record changes, that food item receives a new FDC_ID number."
@andreimateiro I have also forwarded your question to the FDC lead. UPDATE: It seems like they're out of office until 8/18. I will try to follow up then.
Thanks!
@andreimateiro @hphungnal That is correct. FDC ID is more an identifier of the published record than an identifier of the item. The idea was to make it easy to reference the data on a record, so that the values do not differ if you access it at a different time. To identify the food itself, each dataset has its own identifier: SR and Foundation use ndb_no, FNDDS uses food_code, and Branded uses the gtin_upc field.
There was a report of older FNDDS FDC_IDs being removed, but this was not intentional, and we are looking at how to add them back into the system in a future release (in the meantime, those are still available in the download file from April 2019). We do not plan on removing FDC_IDs of Foundation and SR; as noted, a new record with a new FDC_ID will be inserted instead.
Let me know if there is anything I can further clarify!
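One way to act on the per-dataset identifiers listed above is to store the dataset's own food identifier next to the FDC ID, so a reference can still be resolved after a new record version is published. The sketch below is only an illustration: the dataset labels (`"foundation"`, `"sr_legacy"`, `"fndds"`, `"branded"`), the `FoodRef` type, and `make_ref` are hypothetical names, and the record values in the usage line are placeholders. Only the field names `ndb_no`, `food_code`, and `gtin_upc` come from the comment above.

```python
from dataclasses import dataclass

# Which field identifies the food itself in each dataset (per the comment above).
# The dictionary keys are hypothetical labels chosen for this sketch.
NATIVE_ID_FIELD = {
    "foundation": "ndb_no",
    "sr_legacy": "ndb_no",
    "fndds": "food_code",
    "branded": "gtin_upc",
}


@dataclass(frozen=True)
class FoodRef:
    dataset: str    # one of the NATIVE_ID_FIELD keys
    native_id: str  # identifies the food across record versions
    fdc_id: int     # identifies one published version of the record


def make_ref(dataset, record):
    """Build a reference that keeps both the food identifier and the version."""
    field = NATIVE_ID_FIELD[dataset]
    return FoodRef(dataset, str(record[field]), int(record["fdc_id"]))


# Placeholder values, not real FDC data.
ref = make_ref("fndds", {"fdc_id": 1, "food_code": 2})
```

Storing the pair costs one extra column, and the `fdc_id` still pins the exact values that were shown when the reference was created.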
Thanks, this really helps. I will close this issue now.
I'm reopening this thread as I found several FDC IDs that don't exist anymore:
e.g., 785758, 789038, and many more.
@Kyle-McKillop, based on the logic presented above, I was hoping to query the same FDC ID and find the same info, but now I get a 404, as if it never existed.
@hphungnal, @littlebunch, @Kyle-McKillop making sure this message reaches you.
@andreimateiro I was out of office, so I apologize for the delay in my response. Looking into it, these IDs were for previous FNDDS cycles. It seems that in the release of the latest FNDDS Survey Foods in October, the previous cycle years were removed. This is not the intended behavior.
Currently you can find information on those previous two cycles at https://fdc.nal.usda.gov/download-datasets.html if there is urgency in retrieving those older records. I will need to look into how to get these two cycles re-added with their existing FDC IDs back into the dataset. I do not currently have a timeline for that.
What I can say for sure is: FDC IDs shouldn't be removed. As far as I can tell, this error is only for previous FNDDS cycles, which occur every two years. Our next FNDDS cycle update will not be until 2023.
Thanks for the update. The problem is that I cache these IDs as "external ids" in my app (to avoid hitting your API too often and to avoid the latency overhead that sometimes occurs), so in this case I probably have hundreds of orphan records. I guess other apps/websites rely on similar mechanisms.
Is there any other way I can relink them? Going one by one is clearly not something I can do.
@andreimateiro I'll ask our contractors and see if there are any thoughts. The downloads, as listed above, both have a table called "survey_fndds_food.csv" that could at least document the FDC_IDs that are currently missing. FNDDS 2015-2016 and FNDDS 2013-2014 are the missing cycles.
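A rough sketch of how the "survey_fndds_food.csv" tables might be used to relink cached IDs in bulk rather than one by one. It rests on two assumptions that should be checked against the actual downloads: that the file has `fdc_id` and `food_code` columns, and that a food keeps the same `food_code` from one FNDDS cycle to the next. The function names are hypothetical.

```python
import csv


def load_fdc_to_food_code(path):
    """Read a survey_fndds_food.csv file into {fdc_id: food_code} (both strings).

    Assumes the columns are named "fdc_id" and "food_code".
    """
    with open(path, newline="", encoding="utf-8") as f:
        return {row["fdc_id"]: row["food_code"] for row in csv.DictReader(f)}


def remap(cached_ids, old_map, new_map):
    """Map cached FDC IDs from an old cycle to the current cycle via food_code.

    old_map and new_map are {fdc_id: food_code} dicts for the old and the
    current cycle. Returns (remapped, orphans): remapped is {old_id: new_id},
    orphans lists cached IDs with no counterpart in the current cycle.
    """
    code_to_new = {code: fdc_id for fdc_id, code in new_map.items()}
    remapped, orphans = {}, []
    for fdc_id in cached_ids:
        code = old_map.get(str(fdc_id))
        new_id = code_to_new.get(code) if code is not None else None
        if new_id is None:
            orphans.append(fdc_id)
        else:
            remapped[fdc_id] = new_id
    return remapped, orphans
```

In use, `old_map` would come from `load_fdc_to_food_code` on a previous-cycle download and `new_map` from the current one; whatever ends up in `orphans` still needs a manual look.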
@Kyle-McKillop , any updates here?
I would really appreciate it if those entries could be re-added. In the meantime, is there any way to remap an FDC ID, even manually?
For example, FDCID: 786756, with name: "Berries, frozen, NFS" is missing in the new cycles. I would expect at least to find the name with new values, but by searching the API, nothing is retrieved.
Any thoughts?
Actually, I managed to go ID by ID (man, that was complicated), and only a few don't have counterparts in the new FNDDS DB. For the moment I'm good.
Is there a possibility to enforce some sort of protection so in the future there is no more data loss?
I'm in the process of integrating my existing recipe/ingredient database with USDA's new API.
My current intent is to store only the FDC_ID as an externalId in the recipe-ingredient list and fetch the data each time a recipe is rendered (thereby avoiding a hard copy from USDA to my own DB).
I found the following documentation inconsistency:
On the usda.gov/help page, the FDC_ID number is described as: "Each time the data in a food record changes, that food item receives a new FDC_ID number." I take this to mean it can't be used as a foreign key, since it changes.
In the Download_Field_Descriptions_Apr2020.pdf file, under foods, the FDC_ID number is described as: "Unique permanent identifier of the food"
Does anyone have a recommendation on how to store external references? The last thing I need is to have recipes with missing ingredients/foods.