USDA / USDA-APIs

Do you have feedback, ideas, or questions for USDA APIs? Use this repository's Issue Tracker to join the discussion.
www.usda.gov/developer

Inconsistency in foundation_food mapping for April 2022 dataset #120

Closed hhimanshu closed 1 year ago

hhimanshu commented 1 year ago

Hello,

I downloaded the CSV files for the April 2022 dataset (I originally wrote Oct 2022 by mistake) from the official download page. I am trying to find a permanent unique id for a food that stays stable across new versions of the FDC datasets.

My current thinking was that fdc_id is the unique identifier that is permanent and stable. But just now I stumbled on this comment from @Kyle-McKillop, which says:

FDC ID is more an identifier of the published record than an identifier of the item. The idea was the ability to easily reference data on a record so that the values are not different if you access it at a different time. Instead each dataset has its own identifier for the food. SR and Foundation use ndb_no, FNDDS uses food_code, and branded uses the gtin_upc field as an identifier.

So, based on that, I compared the food entries for each data source in the FDC dataset, namely foundation_food, sr_legacy, survey (FNDDS), and branded_food. What I found was that each data source has two files: food.csv and <datasource_name>_food.csv. To my surprise, in every data source except foundation_food the two files have the same number of lines.

➜  FoodData_Central_survey_food_csv_2020-10-30 wc -l food.csv survey_fndds_food.csv 
    7084 food.csv
    7084 survey_fndds_food.csv
   14168 total
➜  FoodData_Central_survey_food_csv_2020-10-30 cd ../FoodData_Central_branded_food_csv_2022-04-28 
➜  FoodData_Central_branded_food_csv_2022-04-28 wc -l food.csv branded_food.csv                   
 1628971 food.csv
 1628971 branded_food.csv
 3257942 total
➜  FoodData_Central_branded_food_csv_2022-04-28 cd ../FoodData_Central_sr_legacy_food_csv_\ 2019-04-02 
➜  FoodData_Central_sr_legacy_food_csv_ 2019-04-02 wc -l food.csv sr_legacy_food.csv                     
    7794 food.csv
    7794 sr_legacy_food.csv
   15588 total
➜  FoodData_Central_sr_legacy_food_csv_ 2019-04-02 cd ../FoodData_Central_foundation_food_csv_2022-04-28 
➜  FoodData_Central_foundation_food_csv_2022-04-28 wc -l food.csv foundation_food.csv                    
   43448 food.csv
     185 foundation_food.csv
   43633 total
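
One caveat on the counts above: wc -l counts every line, including each file's header row, so the number of food records in each file is presumably one less than the figure shown (assuming each file has a single header line and ends with a newline). A minimal Python sketch that counts only the data rows; the helper name count_data_rows and the inline sample are illustrative, not part of the FDC download:

```python
import csv
import io


def count_data_rows(csv_text: str) -> int:
    """Count the records in CSV text, excluding the header row."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row, if there is one
    return sum(1 for _ in reader)


# Two data rows copied from foundation_food.csv as shown in this thread.
sample = (
    '"fdc_id","NDB_number","footnote"\n'
    '"321358","16158",""\n'
    '"321360","100147",""\n'
)

print(sample.count("\n"))       # what `wc -l` would report for this text
print(count_data_rows(sample))  # data rows only
```

The comparison between the two files in each directory is unaffected, since both files carry a header line.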

Based on that, I have the following questions:

  1. Is the mapping incomplete for foundation_food? Or do we have more data in food.csv under the foundation_food dataset?
  2. For permanent unique identification, are the following mappings correct? That is, if FDC publishes new versions of the datasets, will a food with one of these unique identifiers still be the same food?

Data Source        Unique Identifier
FNDDS              food_code
SR Legacy          ndb_no
Foundation Foods   ndb_no
Branded Foods      gtin_upc

KyleMcKillop-USDA commented 1 year ago

Edit: removed quoted information

Thanks for your questions.

First, a clarification. You mention an October 2022 dataset, which has not been released yet. Given the record counts you posted, I am assuming you are referring to the April 2022 files.

Based on that I have following questions

  1. Is the mapping incomplete for foundation_food? Or we have more data in food.csv under foundation_food dataset?

    • The data are correct. Foundation foods are different because of the layer of metadata that underlies each one. A foundation_food profile is made up of sample_foods. Sample_foods are made up of sub_sample_foods (the sub-samples that are sent to the lab for analysis and are linked to sub_sample_results). There is also a design pattern around acquisition samples, which represent the food as acquired (from agriculture or purchased from a market); these acquired foods become sample_foods. This pattern exists to accommodate some 2019 foundation foods where multiple acquisitions were composited together into one sample for analysis. After 2019, there is a one-to-one relationship between sample_foods and acquisition_foods. All of this causes a much larger count of foods than the total number of foundation foods available. If you are only interested in the summarized results for a foundation food, use only the ids found in the foundation_food table.
  2. For a permanent unique identification, are following mappings correct? Which means if new versions of datasets are published by FDC, the food with these Unique Identifiers will still be the same?

    • Correct with one small caveat, the system currently treats branded identifiers as a pairing between source [GDSN or LI] and gtin_upc.
hhimanshu commented 1 year ago

@Kyle-McKillop, first of all, I apologize for typing the wrong dataset name; come on, Oct 2022 has not even arrived yet ;-). I have fixed the title and description now. Next, THANK YOU for such a detailed response. I appreciate your words; they helped me understand the dataset a bit better.

Now, I will confirm what I understood from your response -

  1. The total number of foundation_food entries is 185, and NDB_number is the unique identifier.

    ➜  usda wc -l FoodData_Central_foundation_food_csv_2022-04-28/foundation_food.csv 
     185 FoodData_Central_foundation_food_csv_2022-04-28/foundation_food.csv
  2. The permanent unique id for the food item depends on which dataset we are looking at. Specifically,

Foundation Foods

permanent unique id is NDB_number

➜  usda head -2 FoodData_Central_foundation_food_csv_2022-04-28/foundation_food.csv 
"fdc_id","NDB_number","footnote"
"321358","16158",""

SR Legacy

permanent unique id is NDB_number

➜  usda head -2 FoodData_Central_sr_legacy_food_csv_\ 2019-04-02/sr_legacy_food.csv 
"fdc_id","NDB_number"
"167512","18634"

Survey FNDDS

permanent unique id is food_code

➜  usda head -2 FoodData_Central_survey_food_csv_2020-10-30/survey_fndds_food.csv 
"fdc_id","food_code","wweia_category_number","start_date","end_date"
"1097510","11000000","9602","2017-01-01","2018-12-31"

Branded Foods

permanent unique id is gtin_upc

➜  usda head -2 FoodData_Central_branded_food_csv_2022-04-28/branded_food.csv 
"fdc_id","brand_owner","brand_name","subbrand_name","gtin_upc","ingredients","not_a_significant_source_of","serving_size","serving_size_unit","household_serving_fulltext","branded_food_category","data_source","package_weight","modified_date","available_date","market_country","discontinued_date","preparation_state_code","trade_channel"
"1105904","Richardson Oilseed Products (US) Limited","","","00027000612323","Vegetable Oil","","15.0","ml","","Oils Edible","GDSN","","2020-10-02","2020-11-13","United States","","",""
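
The per-dataset mapping above can be written down as a small lookup. This is only a sketch of the understanding stated in this comment, not an official FDC utility: the ID_COLUMN table and the stable_ids helper are illustrative names, and the sample rows are the ones shown above.

```python
import csv
import io

# Identifier column for each dataset file, as listed in this thread.
ID_COLUMN = {
    "foundation_food.csv": "NDB_number",
    "sr_legacy_food.csv": "NDB_number",
    "survey_fndds_food.csv": "food_code",
    "branded_food.csv": "gtin_upc",
}


def stable_ids(filename: str, csv_text: str) -> dict:
    """Map each fdc_id in csv_text to the dataset's own food identifier."""
    column = ID_COLUMN[filename]
    reader = csv.DictReader(io.StringIO(csv_text))
    # Keep identifiers as strings so leading zeros (e.g. in gtin_upc) survive.
    return {row["fdc_id"]: row[column] for row in reader}


foundation = '"fdc_id","NDB_number","footnote"\n"321358","16158",""\n'
survey = (
    '"fdc_id","food_code","wweia_category_number","start_date","end_date"\n'
    '"1097510","11000000","9602","2017-01-01","2018-12-31"\n'
)

print(stable_ids("foundation_food.csv", foundation))
print(stable_ids("survey_fndds_food.csv", survey))
```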

Does all of this sound right, @Kyle-McKillop? Also, what do you mean when you say

Correct with one small caveat, the system currently treats branded identifiers as a pairing between source [GDSN or LI] and gtin_upc.

Can you explain that in a bit more detail? I am new to this dataset and would appreciate your help.

hhimanshu commented 1 year ago

Another (potential) issue, which could also be my mistake: I am trying to get all 185 foundation foods by joining the two files, food.csv and foundation_food.csv. It seems that not all ids in foundation_food are available in food.

Here is what I did

➜  FoodData_Central_foundation_food_csv_2022-04-28 head food.csv foundation_food.csv              
==> food.csv <==
"fdc_id","data_type","description","food_category_id","publication_date"
"319874","sample_food","HUMMUS, SABRA CLASSIC","16","2019-04-01"
"319875","market_acquisition","HUMMUS, SABRA CLASSIC","16","2019-04-01"
"319876","market_acquisition","HUMMUS, SABRA CLASSIC","16","2019-04-01"
"319877","sub_sample_food","Hummus","16","2019-04-01"
"319878","sub_sample_food","Hummus","16","2019-04-01"
"319879","sample_food","HUMMUS, SABRA CLASSIC","16","2019-04-01"
"319880","market_acquisition","HUMMUS, SABRA CLASSIC","16","2019-04-01"
"319881","market_acquisition","HUMMUS, SABRA CLASSIC","16","2019-04-01"
"319882","sub_sample_food","Hummus","16","2019-04-01"

==> foundation_food.csv <==
"fdc_id","NDB_number","footnote"
"321358","16158",""
"321360","100147",""
"321611","11056",""
"323121","7022",""
"323294","12563"," Other phytosterols = 34.67 mg/100g"
"323505","11233",""
"323604","1171",""
"323697","1172",""
"323793","1173",""

# join the two files
➜  FoodData_Central_foundation_food_csv_2022-04-28 join -t, -1 1 -2 1 foundation_food.csv food.csv | wc -l
     113

So 72 (= 185 - 113) foundation_food entries have no mapping in food.csv
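
A likely reason for the shortfall is that the join utility expects both inputs to be sorted on the join field in text order, while these files appear to be ordered by numeric fdc_id (for example, "790646" comes before "1104647" in foundation_food.csv). A key lookup does not depend on ordering. The following Python sketch does the same inner join; the function name join_on_fdc_id is illustrative and the rows are copied from the data shown in this thread:

```python
import csv
import io


def join_on_fdc_id(foundation_text: str, food_text: str) -> list:
    """Inner-join foundation_food rows with food rows on fdc_id."""
    food_by_id = {
        row["fdc_id"]: row for row in csv.DictReader(io.StringIO(food_text))
    }
    joined = []
    for ff in csv.DictReader(io.StringIO(foundation_text)):
        food = food_by_id.get(ff["fdc_id"])
        if food is not None:  # keep only ids present in both files
            joined.append({**ff, **food})
    return joined


foundation_text = '''"fdc_id","NDB_number","footnote"
"321358","16158",""
"1104647","11215",""
'''
food_text = '''"fdc_id","data_type","description","food_category_id","publication_date"
"319874","sample_food","HUMMUS, SABRA CLASSIC","16","2019-04-01"
"321358","foundation_food","Hummus, commercial","16","2019-04-01"
"1104647","foundation_food","Garlic, raw","11","2020-10-30"
'''

rows = join_on_fdc_id(foundation_text, food_text)
for row in rows:
    print(row["fdc_id"], row["NDB_number"], row["description"])
```

Because the csv module parses the quoted fields, descriptions that contain commas stay intact, which a plain comma-delimited join would not guarantee.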

hhimanshu commented 1 year ago

Okay, so I spent some more time investigating whether there are any issues with the data or whether I was making a mistake (well, I was!). I installed the q tool to run SQL on CSV files. Here is what I did:

Count number of foundation foods

➜  FoodData_Central_foundation_food_csv_2022-04-28 q 'select count(*) from foundation_food.csv'                                             
185

Count number of food in Foundation Food directory

➜  FoodData_Central_foundation_food_csv_2022-04-28 q 'select count(*) from food.csv'                                                        
43448

This is where the sample and sub_sample records are bloating the total number
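
To see how the total breaks down, the rows of food.csv can be counted per data_type. A minimal Python sketch, assuming the data_type column takes the values seen in this thread (sample_food, market_acquisition, sub_sample_food, foundation_food); the helper name count_by_data_type is illustrative:

```python
import csv
import io
from collections import Counter


def count_by_data_type(csv_text: str) -> Counter:
    """Count the rows of a food.csv-style text per data_type value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["data_type"] for row in reader)


# Rows copied from food.csv as shown in this thread.
sample_food_csv = '''"fdc_id","data_type","description","food_category_id","publication_date"
"319874","sample_food","HUMMUS, SABRA CLASSIC","16","2019-04-01"
"319875","market_acquisition","HUMMUS, SABRA CLASSIC","16","2019-04-01"
"319876","market_acquisition","HUMMUS, SABRA CLASSIC","16","2019-04-01"
"319877","sub_sample_food","Hummus","16","2019-04-01"
"319878","sub_sample_food","Hummus","16","2019-04-01"
"321358","foundation_food","Hummus, commercial","16","2019-04-01"
'''

counts = count_by_data_type(sample_food_csv)
for data_type, n in counts.items():
    print(data_type, n)
```

Whether filtering on data_type alone selects exactly the ids listed in foundation_food.csv is not confirmed in this thread, so joining against foundation_food.csv remains the safer selection.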

Join the two files on fdc_id to create a composite dataset, so that NDB_number can be captured and the correct foundation foods are selected

➜  FoodData_Central_foundation_food_csv_2022-04-28 q -d , 'select ff.*, f.* from foundation_food.csv ff JOIN food.csv f ON (ff.c1 = f.c1)' | wc -l
Warning - There seems to be header line in the file, but -H has not been specified. All fields will be detected as text fields, and the header line will appear as part of the data
Warning - There seems to be header line in the file, but -H has not been specified. All fields will be detected as text fields, and the header line will appear as part of the data
     185

Same query as above, but show all data

➜  FoodData_Central_foundation_food_csv_2022-04-28 q -d , 'select ff.*, f.* from foundation_food.csv ff JOIN food.csv f ON (ff.c1 = f.c1)'        
Warning - There seems to be header line in the file, but -H has not been specified. All fields will be detected as text fields, and the header line will appear as part of the data
Warning - There seems to be header line in the file, but -H has not been specified. All fields will be detected as text fields, and the header line will appear as part of the data
fdc_id,NDB_number,footnote,fdc_id,data_type,description,food_category_id,publication_date
321358,16158,,321358,foundation_food,"Hummus, commercial",16,2019-04-01
321360,100147,,321360,foundation_food,"Tomatoes, grape, raw",11,2019-04-01
321611,11056,,321611,foundation_food,"Beans, snap, green, canned, regular pack, drained solids",11,2019-04-01
323121,7022,,323121,foundation_food,"Frankfurter, beef, unheated",7,2019-04-01
323294,12563, Other phytosterols = 34.67 mg/100g,323294,foundation_food,"Nuts, almonds, dry roasted, with salt added",12,2019-04-01
323505,11233,,323505,foundation_food,"Kale, raw",11,2019-04-01
323604,1171,,323604,foundation_food,"Egg, whole, raw, frozen, pasteurized",1,2019-04-01
323697,1172,,323697,foundation_food,"Egg, white, raw, frozen, pasteurized",1,2019-04-01
323793,1173,,323793,foundation_food,"Egg, white, dried",1,2019-04-01
324317,11296,,324317,foundation_food,"Onion rings, breaded, par fried, frozen, prepared, heated in oven",11,2019-04-01
324653,11937,,324653,foundation_food,"Pickles, cucumber, dill or kosher dill",11,2019-04-01
325036,1032,,325036,foundation_food,"Cheese, parmesan, grated",1,2019-04-01
325198,1042,,325198,foundation_food,"Cheese, pasteurized process, American, vitamin D fortified",1,2019-04-01
325287,9123,,325287,foundation_food,"Grapefruit juice, white, canned or bottled, unsweetened",9,2019-04-01
325430,9236,,325430,foundation_food,"Peaches, yellow, raw",9,2019-04-01
325524,12537,,325524,foundation_food,"Seeds, sunflower seed kernels, dry roasted, with salt added",12,2019-04-01
325871,18069,,325871,foundation_food,"Bread, white, commercially prepared",18,2019-04-01
326196,11236,,326196,foundation_food,"Kale, frozen, cooked, boiled, drained, without salt",11,2019-04-01
326698,2046,,326698,foundation_food,"Mustard, prepared, yellow",2,2019-04-01
327046,9148,,327046,foundation_food,"Kiwifruit, green, raw",9,2019-04-01
327357,9191,,327357,foundation_food,"Nectarines, raw",9,2019-04-01
328637,1009,,328637,foundation_food,"Cheese, cheddar",1,2019-04-01
328841,1015,,328841,foundation_food,"Cheese, cottage, lowfat, 2% milkfat",1,2019-04-01
329370,1029,,329370,foundation_food,"Cheese, mozzarella, low moisture, part-skim",1,2019-04-01
329490,1133,,329490,foundation_food,"Egg, whole, dried",1,2019-04-01
329596,1126,,329596,foundation_food,"Egg, yolk, raw, frozen, pasteurized",1,2019-04-01
329716,1137,,329716,foundation_food,"Egg, yolk, dried",1,2019-04-01
330137,1256,,330137,foundation_food,"Yogurt, Greek, plain, nonfat",1,2019-04-01
330415,1285,,330415,foundation_food,"Yogurt, Greek, strawberry, nonfat",1,2019-04-01
330458,4047,,330458,foundation_food,"Oil, coconut",4,2019-04-01
331897,5671,,331897,foundation_food,"Chicken, broilers or fryers, drumstick, meat only, cooked, braised",5,2019-04-01
331960,5746,,331960,foundation_food,"Chicken, broiler or fryers, breast, skinless, boneless, meat only, cooked, braised",5,2019-04-01
332282,6931,,332282,foundation_food,"Sauce, pasta, spaghetti/marinara, ready-to-serve",6,2019-04-01
332397,7028,,332397,foundation_food,"Ham, sliced, pre-packaged, deli meat (96%fat free, water added)",7,2019-04-01
332791,9533,,332791,foundation_food,"Olives, green, Manzanilla, stuffed with pimiento",9,2019-04-01
333008,100192,,333008,foundation_food,"Cookies, oatmeal, soft, with raisins",18,2019-04-01
333281,100193,,333281,foundation_food,"Tomatoes, canned, red, ripe, diced",11,2019-04-01
333374,15033,May contain additives to retain moisture,333374,foundation_food,"Fish, haddock, raw",15,2019-04-01
333476,15066,May contain additives to retain moisture,333476,foundation_food,"Fish, pollock, raw",15,2019-04-01
334194,15121,,334194,foundation_food,"Fish, tuna, light, canned in water, drained solids",15,2019-04-01
334536,36602,,334536,foundation_food,"Restaurant, Chinese, fried rice, without meat",25,2019-04-01
334628,36412,,334628,foundation_food,"Restaurant, Latino, tamale, pork",25,2019-04-01
334720,36408,,334720,foundation_food,"Restaurant, Latino, pupusas con frijoles (pupusas, bean)",25,2019-04-01
335240,18075,,335240,foundation_food,"Bread, whole-wheat, commercially prepared",18,2019-04-01
746758,23377,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746758,foundation_food,"Beef, loin, tenderloin roast, separable lean only, boneless, trimmed to 0"" fat, select, cooked, roasted",13,2019-12-16
746759,23385,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746759,foundation_food,"Beef, loin, top loin steak, boneless, lip-on, separable lean only, trimmed to 1/8"" fat, choice, raw",13,2019-12-16
746760,23362,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746760,foundation_food,"Beef, round, eye of round roast, boneless, separable lean only, trimmed to 0"" fat, select, raw",13,2019-12-16
746761,23359,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746761,foundation_food,"Beef, round, top round roast, boneless, separable lean only, trimmed to 0"" fat, select, raw",13,2019-12-16
746762,13468,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746762,foundation_food,"Beef, short loin, porterhouse steak, separable lean only, trimmed to 1/8"" fat, select, raw",13,2019-12-16
746763,13236,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746763,foundation_food,"Beef, short loin, t-bone steak, bone-in, separable lean only, trimmed to 1/8"" fat, choice, cooked, grilled",13,2019-12-16
746764,11130,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746764,foundation_food,"Carrots, frozen, unprepared",11,2019-12-16
746765,1227,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746765,foundation_food,"Cheese, dry white, queso seco",1,2019-12-16
746766,1036,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746766,foundation_food,"Cheese, ricotta, whole milk",1,2019-12-16
746767,1040,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746767,foundation_food,"Cheese, swiss",1,2019-12-16
746768,9094,Mission variety. Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746768,foundation_food,"Figs, dried, uncooked",9,2019-12-16
746769,11251,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746769,foundation_food,"Lettuce, cos or romaine, raw",11,2019-12-16
746770,9181,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746770,foundation_food,"Melons, cantaloupe, raw",9,2019-12-16
746771,9202,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746771,foundation_food,"Oranges, raw, navels",9,2019-12-16
746772,1082,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746772,foundation_food,"Milk, lowfat, fluid, 1% milkfat, with added vitamin A and vitamin D",1,2019-12-16
746773,9412,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746773,foundation_food,"Pears, raw, bartlett",9,2019-12-16
746774,36622,Rice was not included in analyses. Ingredients and amount of breading and sauce vary by restaurant. Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the value,746774,foundation_food,"Restaurant, Chinese, sweet and sour pork",25,2019-12-16
746775,2047,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746775,foundation_food,"Salt, table, iodized",2,2019-12-16
746776,1085,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746776,foundation_food,"Milk, nonfat, fluid, with added vitamin A and vitamin D (fat free or skim)",1,2019-12-16
746777,6164,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746777,foundation_food,"Sauce, salsa, ready-to-serve",6,2019-12-16
746778,1079,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746778,foundation_food,"Milk, reduced fat, fluid, 2% milkfat, with added vitamin A and vitamin D",1,2019-12-16
746779,7954,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746779,foundation_food,"Sausage, breakfast sausage, beef, pre-cooked, unprepared",7,2019-12-16
746780,7089,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746780,foundation_food,"Sausage, Italian, pork, mild, cooked, pan-fried",7,2019-12-16
746781,100173,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746781,foundation_food,"Sausage, pork, chorizo, link or ground, cooked, pan-fried",7,2019-12-16
746782,1077,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746782,foundation_food,"Milk, whole, 3.25% milkfat, with added vitamin D",1,2019-12-16
746783,7919,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746783,foundation_food,"Sausage, turkey, breakfast links, mild, raw",7,2019-12-16
746784,19335,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746784,foundation_food,"Sugars, granulated",19,2019-12-16
746785,5666,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,746785,foundation_food,"Turkey, ground, 93% lean, 7% fat, pan-broiled crumbles",5,2019-12-16
746952,7129,,746952,foundation_food,"Ham, sliced, restaurant",10,2019-12-16
747429,1062,,747429,foundation_food,"Cheese, American, restaurant",1,2019-12-16
747430,16500,,747430,foundation_food,"Beans, Dry, Medium Red (0% moisture)",16,2019-12-16
747431,16501,,747431,foundation_food,"Beans, Dry, Red (0% moisture)",16,2019-12-16
747432,16502,,747432,foundation_food,"Beans, Dry, Flor de Mayo (0% moisture)",16,2019-12-16
747433,16503,,747433,foundation_food,"Beans, Dry, Brown (0% moisture)",16,2019-12-16
747434,16504,,747434,foundation_food,"Beans, Dry, Tan (0% moisture)",16,2019-12-16
747435,16505,,747435,foundation_food,"Beans, Dry, Light Tan (0% moisture)",16,2019-12-16
747436,16506,,747436,foundation_food,"Beans, Dry, Carioca (0% moisture)",16,2019-12-16
747437,16507,,747437,foundation_food,"Beans, Dry, Cranberry (0% moisture)",16,2019-12-16
747438,16508,,747438,foundation_food,"Beans, Dry, Light Red Kidney (0% moisture)",16,2019-12-16
747439,16509,,747439,foundation_food,"Beans, Dry, Pink (0% moisture)",16,2019-12-16
747440,16510,,747440,foundation_food,"Beans, Dry, Dark Red Kidney (0% moisture)",16,2019-12-16
747441,16511,,747441,foundation_food,"Beans, Dry, Navy (0% moisture)",16,2019-12-16
747442,16512,,747442,foundation_food,"Beans, Dry, Small White (0% moisture)",16,2019-12-16
747443,16513,,747443,foundation_food,"Beans, Dry, Small Red (0% moisture)",16,2019-12-16
747444,16514,,747444,foundation_food,"Beans, Dry, Black (0% moisture)",16,2019-12-16
747445,16515,,747445,foundation_food,"Beans, Dry, Pinto (0% moisture)",16,2019-12-16
747446,16516,,747446,foundation_food,"Beans, Dry, Great Northern (0% moisture)",16,2019-12-16
747447,11090,Source number reflects the actual number of samples analyzed for a nutrient. Repeat nutrient analyses may have been done on the same sample with the values shown.,747447,foundation_food,"Broccoli, raw",11,2019-12-16
747693,11966,,747693,foundation_food,"Ketchup, restaurant",11,2019-12-16
747997,1124,,747997,foundation_food,"Eggs, Grade A, Large, egg white",1,2019-12-16
748236,1125,,748236,foundation_food,"Eggs, Grade A, Large, egg yolk",1,2019-12-16
748278,4582,,748278,foundation_food,"Oil, canola",4,2019-12-16
748323,4518,,748323,foundation_food,"Oil, corn",4,2019-12-16
748366,4044,,748366,foundation_food,"Oil, soybean",4,2019-12-16
748608,4063,,748608,foundation_food,"Oil, olive, extra virgin",4,2019-12-16
748967,1123,,748967,foundation_food,"Eggs, Grade A, Large, egg whole",1,2019-12-16
749420,10896,,749420,foundation_food,"Pork, cured, bacon, cooked, restaurant",10,2019-12-16
789828,1145,,789828,foundation_food,"Butter, stick, unsalted",1,2020-04-01
789890,20081,,789890,foundation_food,"Flour, wheat, all-purpose, enriched, bleached",20,2020-04-01
789951,20581,,789951,foundation_food,"Flour, wheat, all-purpose, enriched, unbleached",20,2020-04-01
790018,20481,,790018,foundation_food,"Flour, wheat, all-purpose, unenriched, unbleached",20,2020-04-01
790085,20080,,790085,foundation_food,"Flour, whole wheat, unenriched",20,2020-04-01
790146,20083,,790146,foundation_food,"Flour, bread, white, enriched, unbleached",20,2020-04-01
790214,20061,,790214,foundation_food,"Flour, rice, white, unenriched",20,2020-04-01
790276,100251,,790276,foundation_food,"Flour, corn, yellow, fine meal, enriched",20,2020-04-01
790508,1001,,790508,foundation_food,"Butter, stick, salted",1,2020-04-01
790577,100252,,790577,foundation_food,"Onions, red, raw",11,2020-04-01
790646,100253,,790646,foundation_food,"Onions, yellow, raw",11,2020-04-01
1104647,11215,,1104647,foundation_food,"Garlic, raw",11,2020-10-30
1104705,16117,,1104705,foundation_food,"Flour, soy, defatted",16,2020-10-30
1104766,16115,,1104766,foundation_food,"Flour, soy, full-fat",16,2020-10-30
1104812,20090,,1104812,foundation_food,"Flour, rice, brown",20,2020-10-30
1104913,100256,,1104913,foundation_food,"Flour, pastry, unenriched, unbleached",20,2020-10-30
1104962,100255,,1104962,foundation_food,"Onions, white, raw",11,2020-10-30
1105073,100254,,1105073,foundation_food,"Bananas, overripe, raw",9,2020-04-01
1105314,9040,,1105314,foundation_food,"Bananas, ripe and slightly ripe, raw",9,2020-04-01
1750339,9500,,1750339,foundation_food,"Apples, red delicious, with skin, raw",9,2020-10-30
1750340,9504,,1750340,foundation_food,"Apples, fuji, with skin, raw",9,2020-10-30
1750341,9503,,1750341,foundation_food,"Apples, gala, with skin, raw",9,2020-10-30
1750342,9502,,1750342,foundation_food,"Apples, granny smith, with skin, raw",9,2020-10-30
1750343,9501,,1750343,foundation_food,"Apples, honeycrisp, with skin, raw",9,2020-10-30
1750348,4042,,1750348,foundation_food,"Oil, peanut",4,2021-04-28
1750349,100262,,1750349,foundation_food,"Oil, sunflower",4,2021-04-28
1750350,4511,,1750350,foundation_food,"Oil, safflower",4,2021-04-28
1750351,100258,,1750351,foundation_food,"Oil, olive, extra light",4,2021-04-28
1999626,100259,,1999626,foundation_food,"Mushroom, lion's mane",11,2021-10-28
1999627,11987,,1999627,foundation_food,"Mushroom, oyster",11,2021-10-28
1999628,11238,,1999628,foundation_food,"Mushrooms, shiitake",11,2021-10-28
1999629,11260,,1999629,foundation_food,"Mushrooms, white button",11,2021-10-28
1999630,16222,,1999630,foundation_food,"Soy milk, unsweetened, plain, shelf stable",16,2021-10-28
1999631,14091,,1999631,foundation_food,"Almond milk, unsweetened, plain, shelf stable",14,2021-10-28
1999632,100260,,1999632,foundation_food,"Spinach, baby",11,2021-10-28
1999633,11457,,1999633,foundation_food,"Spinach, mature",11,2021-10-28
1999634,100261,,1999634,foundation_food,"Tomato, roma",11,2021-10-28
2003586,100268,,2003586,foundation_food,"Flour, 00",20,2021-10-28
2003587,20140,,2003587,foundation_food,"Flour, spelt, whole grain",20,2021-10-28
2003588,100269,,2003588,foundation_food,"Flour, semolina, coarse and semi-coarse",20,2021-10-28
2003589,100270,,2003589,foundation_food,"Flour, semolina, fine",20,2021-10-28
2003590,9400,,2003590,foundation_food,"Apple juice, with added vitamin C, from concentrate, shelf stable",9,2021-10-28
2003591,9209,,2003591,foundation_food,"Orange juice, no pulp, not fortified, from concentrate, refrigerated",9,2021-10-28
2003592,9130,,2003592,foundation_food,"Grape juice, purple, with added vitamin C, from concentrate, shelf stable",9,2021-10-28
2003593,100266,,2003593,foundation_food,"Grape juice, white, with added vitamin C, from concentrate, shelf stable",9,2021-10-28
2003594,43382,,2003594,foundation_food,"Cranberry juice, not fortified, from concentrate, shelf stable",9,2021-10-28
2003595,100267,,2003595,foundation_food,"Grapefruit juice, red, not fortified, not from concentrate, refrigerated",9,2021-10-28
2003596,11540,,2003596,foundation_food,"Tomato juice, with added ingredients, from concentrate, shelf stable",9,2021-10-28
2003597,9206,,2003597,foundation_food,"Orange juice, no pulp, not fortified, not from concentrate, refrigerated",9,2021-10-28
2003598,11243,,2003598,foundation_food,"Mushroom, portabella",11,2021-10-28
2003599,100263,,2003599,foundation_food,"Mushroom, king oyster",11,2021-10-28
2003600,11950,,2003600,foundation_food,"Mushroom, enoki",11,2021-10-28
2003601,11266,,2003601,foundation_food,"Mushroom, crimini",11,2021-10-28
2003602,11993,,2003602,foundation_food,"Mushroom, maitake",11,2021-10-28
2003603,100264,,2003603,foundation_food,"Mushroom, beech",11,2021-10-28
2003604,100265,,2003604,foundation_food,"Mushroom, pioppini",11,2021-10-28
2257044,100275,,2257044,foundation_food,"Soy milk, unsweetened, plain, refrigerated",16,2022-04-28
2257045,100276,,2257045,foundation_food,"Almond milk, unsweetened, plain, refrigerated",14,2022-04-28
2257046,100277,,2257046,foundation_food,"Oat milk, unsweetened, plain, refrigerated",14,2022-04-28
2258586,11124,,2258586,foundation_food,"Carrots, mature, raw",11,2022-04-28
2258587,11960,,2258587,foundation_food,"Carrots, baby, raw",11,2022-04-28
2258588,11333,,2258588,foundation_food,"Peppers, bell, green, raw",11,2022-04-28
2258589,11951,,2258589,foundation_food,"Peppers, bell, yellow, raw",11,2022-04-28
2258590,11821,,2258590,foundation_food,"Peppers, bell, red, raw",11,2022-04-28
2258591,100278,,2258591,foundation_food,"Peppers, bell, orange, raw",11,2022-04-28
2259792,1088,,2259792,foundation_food,"Buttermilk, low fat",1,2022-04-28
2259793,1116,,2259793,foundation_food,"Yogurt, plain, whole milk",1,2022-04-28
2259794,1293,,2259794,foundation_food,"Yogurt, Greek, plain, whole milk",1,2022-04-28
2259795,100272,,2259795,foundation_food,"Cheese, parmesan, grated, refrigerated",1,2022-04-28
2259796,1019,,2259796,foundation_food,"Cheese, feta, whole milk, crumbled",1,2022-04-28
2261420,100273,,2261420,foundation_food,"Flour, almond",12,2022-04-28
2261421,100274,,2261421,foundation_food,"Flour, oat, whole grain",20,2022-04-28
2261422,11413,,2261422,foundation_food,"Flour, potato",11,2022-04-28
2262072,16098,,2262072,foundation_food,"Peanut butter, creamy",16,2022-04-28
2262073,100271,,2262073,foundation_food,"Sesame butter, creamy",12,2022-04-28
2262074,12195,,2262074,foundation_food,"Almond butter, creamy",12,2022-04-28
2262075,12220,,2262075,foundation_food,"Flaxseed, ground",12,2022-04-28
2263887,9316,,2263887,foundation_food,"Strawberries, raw",9,2022-04-28
2263888,9302,,2263888,foundation_food,"Raspberries, raw",9,2022-04-28
2263889,9050,,2263889,foundation_food,"Blueberries, raw",9,2022-04-28
2263890,100279,,2263890,foundation_food,"Grapes, red, seedless, raw",9,2022-04-28
2263891,100280,,2263891,foundation_food,"Grapes, green, seedless, raw",9,2022-04-28
2263892,9401,,2263892,foundation_food,"Applesauce, unsweetened, with added vitamin C",9,2022-04-28

So I think this is how I can get all foundation foods from the provided dataset. Could you please confirm this, @Kyle-McKillop, and also confirm my understanding from my previous comment when you get a chance?

Thanks a ton!

hhimanshu commented 1 year ago

@Kyle-McKillop , could you please help with above questions? Thank you

KyleMcKillop-USDA commented 1 year ago

@Kyle-McKillop , could you please help with above questions? Thank you

Thanks @hhimanshu for bringing it back to the top of my inbox!

First, regarding the branded food identification: the intent is that GTIN_UPC is a unique identifier for the product. The reality is that there are two incoming data sources for branded foods, LI and GDSN, and that there might be unintended duplicates in GTIN_UPC. Therefore, the system handles uniqueness as a pairing of source and GTIN_UPC: ("LI", 012345678) would be considered distinct from ("GDSN", 012345678).
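
The pairing described above can be sketched in Python as a composite key. The column names data_source and gtin_upc come from the branded_food.csv header shown earlier in the thread; the helper names and the three sample rows (including their fdc_id values) are hypothetical.

```python
import csv
import io
from collections import defaultdict


def branded_key(row: dict) -> tuple:
    """Return the (data_source, gtin_upc) pair that identifies a branded food."""
    return (row["data_source"], row["gtin_upc"])


def sources_by_gtin(csv_text: str) -> dict:
    """Group the data sources under which each gtin_upc appears."""
    sources = defaultdict(set)
    for row in csv.DictReader(io.StringIO(csv_text)):
        source, gtin = branded_key(row)
        sources[gtin].add(source)
    return dict(sources)


# Hypothetical rows, reduced to the columns needed here.
sample = '''"fdc_id","gtin_upc","data_source"
"1","012345678","LI"
"2","012345678","GDSN"
"3","00027000612323","GDSN"
'''

for gtin, sources in sources_by_gtin(sample).items():
    print(gtin, sorted(sources))
```

Keeping gtin_upc as a string preserves leading zeros, which would be lost if the column were parsed as a number.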

KyleMcKillop-USDA commented 1 year ago

@hhimanshu

Second, you are correct in your fixed query. This is the list of items for Foundation Foods. You can always compare against the list as it appears in the application via https://fdc.nal.usda.gov/fdc-app.html#/?query=

hhimanshu commented 1 year ago

This is a great help, thank you, @Kyle-McKillop