USDA / USDA-APIs

Do you have feedback, ideas, or questions for USDA APIs? Use this repository's Issue Tracker to join the discussion.
www.usda.gov/developer

Data format Backwards Compatibility and Consistency with previous Food Composition API #74

Open Nutriadmin opened 4 years ago

Nutriadmin commented 4 years ago

I have been experimenting with the Food Data Central API, and I have noticed several differences in the data format when compared with the USDA Food Composition Database API.

For instance, if I want to query the food list for a term like "banana" including just the SR Legacy data type using the following body in my POST request:

{
    generalSearchInput: 'banana',
    includeDataTypes: {'Survey (FNDDS)': false, 'Foundation': false, 'Branded': false, 'SR Legacy': true}
}

The response contains an array of foods like the one in the example below:

{
    allHighlightFields: "",
    dataType: "SR Legacy",
    description: "Bananas, raw",
    fdcId: 173944,
    ndbNumber: "9040",
    publishedDate: "2019-04-01",
    scientificName: "Musa acuminata Colla",
    score: 356.03656,
}

Now the issue is that there are several differences to the way these results used to be with USDA Food Composition Database:

  1. The "ndbNumber" is "9040". In the previous USDA db, there was a leading zero, so the id would be "09040".
  2. The previous API included a "category" field, (e.g. vegetable products, meat products, fish products, etc).
  3. Names of properties are different. E.g. "ndbNumber" instead of "ndbno", or "description" instead of "name".

Are these differences intentional, and are there any plans to make the format more consistent with the previous API? We will have to make several changes to data formats and data processing code in our application to deal with the new conventions. Also, we currently use the "category" field to filter items by category, but this field doesn't seem to be available anymore.

Sorry if I am missing something. In our case at least, it would be convenient if the data structure were consistent with the previous API. Please let me know if there is something I can do or if there are plans to change the data format.
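For anyone in the same position, differences 1 and 3 can be worked around on the client side. The following is a minimal sketch, not part of either API: the function name `to_legacy_item` and the `FIELD_RENAMES` table are hypothetical, and the padding assumes NDB numbers are five characters wide, as in "09040". It does not address the missing "category" field.

```python
# Hypothetical mapping covering only the two renames mentioned in this thread.
FIELD_RENAMES = {"ndbNumber": "ndbno", "description": "name"}


def to_legacy_item(fdc_item):
    """Return a copy of an FDC search result using the older field names."""
    legacy = {}
    for key, value in fdc_item.items():
        legacy[FIELD_RENAMES.get(key, key)] = value
    # Assumption: the old API used five-character NDB numbers ("09040").
    if legacy.get("ndbno") is not None:
        legacy["ndbno"] = str(legacy["ndbno"]).zfill(5)
    return legacy


# Sample item copied from the search response quoted above.
banana = {
    "dataType": "SR Legacy",
    "description": "Bananas, raw",
    "fdcId": 173944,
    "ndbNumber": "9040",
}
legacy_banana = to_legacy_item(banana)
print(legacy_banana["ndbno"], legacy_banana["name"])
```

Fields that have no entry in the mapping, such as `fdcId`, are passed through unchanged.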

littlebunch commented 4 years ago

@Nutriadmin I've sent your comments to the FDC developers. Thanks. It's just the sort of feedback the developers and principals need to provide a useful FDC API well ahead of the 31 March 2020 cutoff. My personal opinion is that some things, such as naming compatibility, additional elements such as categories and padding NDB numbers, ought to be pretty straightforward. I'm not so sure about data structure consistency. I hope we have some definitive answers soon.

Nutriadmin commented 4 years ago

That's amazing, thanks so much. I would imagine that in general, most of the current users of the Food Composition Database would appreciate it if the new API was as compatible as possible with the previous version, so that migration is easier.

I have also noticed another 3 changes that could be hard to work around for existing applications.

Issue 1 When searching for details of a food, would it be possible to query by ndbNumber, rather than using the new ID number?

E.g. to get details on a banana, the following works:

    curl -H "Content-Type:application/json" https://api.nal.usda.gov/fdc/v1/173944?api_key=MY_KEY

But I haven't found a way to query by the old ndbNumber: 9040 (or ideally 09040 with the padding).

In our particular case we could work around this limitation with some extra work, but if it's easy to implement on your end that would be appreciated.

Issue 2 I don't seem to be able to get portion/measures data when I query for a food report. Would it be possible to add a parameter to the query so that this information can be retrieved, as in the previous API?

Issue 3 The naming and structure for nutrients have changed. For example, this is what "Water" looks like now in the array of nutrients for the banana query:

    {
        "type": "FoodNutrient",
        "id": 1817955,
        "nutrient": {"id": 1051, "number": "255", "name": "Water", "rank": 100, "unitName": "g"},
        "dataPoints": 20,
        "foodNutrientDerivation": {
            "id": 1,
            "code": "A",
            "description": "Analytical",
            "foodNutrientSource": {"id": 1, "code": "1", "description": "Analytical or derived from analytical"}
        },
        "max": 78.20000000,
        "min": 71.30000000,
        "amount": 74.91000000
    }

Previously it looked like this:

    { "nutrient_id": "255", "unit": "g", "name": "Water", "value": 74.9 }

The changes are:

Whilst it is great to have more data, I wonder if it would be possible to keep the names and "nestedness" the same as before, e.g. so that we can continue to do nutrient.nutrient_id in the code rather than nutrient.nutrient.id as it would be with FDC.

If these changes are not possible it wouldn't be the end of the world in our case, but they would be appreciated if it's feasible to implement them on your end.
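In the meantime, a client-side adapter can flatten the new nested entries back into the old shape. This is a minimal sketch under one assumption taken from the "Water" example in this comment: that the nested `number` field ("255") carries the value the old API exposed as `nutrient_id`. The function name `flatten_nutrient` is hypothetical.

```python
def flatten_nutrient(food_nutrient):
    """Convert one FDC foodNutrient entry to the old flat layout."""
    nutrient = food_nutrient["nutrient"]
    return {
        # Assumption: the nested "number" matches the old "nutrient_id".
        "nutrient_id": nutrient["number"],
        "unit": nutrient["unitName"],
        "name": nutrient["name"],
        # The amount is passed through as-is; no rounding is applied.
        "value": food_nutrient["amount"],
    }


# Trimmed copy of the "Water" entry quoted in this comment.
water = {
    "type": "FoodNutrient",
    "id": 1817955,
    "nutrient": {"id": 1051, "number": "255", "name": "Water",
                 "rank": 100, "unitName": "g"},
    "amount": 74.91,
}
flat_water = flatten_nutrient(water)
print(flat_water["nutrient_id"], flat_water["value"])
```

With this in place, existing code can keep reading `nutrient_id` and `value` from each entry; the extra FDC fields (min, max, derivation) are simply dropped.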

Thanks for taking the time to consider this feedback.

humandoing commented 4 years ago

Hi @littlebunch

I'd love to add to the comments from @Nutriadmin that, unless I'm crazy, it doesn't look like there is any way to perform a direct query / lookup of a given food item based on ndbNo.

It feels to me like this could be a significant issue for many users whose existing software implementations used ndbNo basically as a foreign key for looking up or refreshing locally cached data from the SR data in the previous API.

My big concern here is migration -- as users of the deprecated API, how do we map old ndbNo food items to new fdcId items easily? Aside from doing a search that includes only the SR data types (as @Nutriadmin does above with banana), and then iterating through results until you find an ndbNo match, it doesn't seem that this is possible.
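That search-and-iterate workaround could look roughly like the sketch below. It assumes the list of foods has already been extracted from a search response; the function name `find_fdc_id` is hypothetical, and the second sample entry is made up purely for illustration. Leading zeros are ignored on both sides so that "09040" matches "9040".

```python
def find_fdc_id(foods, ndb_no):
    """Return the fdcId of the first food whose ndbNumber matches, else None."""
    wanted = str(ndb_no).lstrip("0")
    for food in foods:
        candidate = str(food.get("ndbNumber") or "").lstrip("0")
        if candidate and candidate == wanted:
            return food["fdcId"]
    return None


foods = [
    # Made-up entry for illustration only; not real FDC data.
    {"fdcId": 1, "ndbNumber": "99999", "description": "Placeholder food"},
    # Entry copied from the banana search result quoted earlier in the thread.
    {"fdcId": 173944, "ndbNumber": "9040", "description": "Bananas, raw"},
]
print(find_fdc_id(foods, "09040"))
```

The result could then be cached locally as an ndbNo-to-fdcId mapping so the search only has to run once per food.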

It would be super nice to be able to query the new FDC API for food details using ndbNo as a query parameter.

Thanks for your consideration!

littlebunch commented 4 years ago

@humandoing One issue in using the ndbNo (ndbNumber) for a food details call is that it is no longer unique -- it can be used for a Foundation food or for an SR Legacy food. (And, it is no longer used in Branded Foods at all.)

There will probably be a new food details call that returns multiple foods like the current API. It might be possible to add a query/filter parameter to this call. It's worth the discussion.

Also, you can search on ndbNumber. Something like this:

    curl -XPOST -H "Content-type:application/json" https://api.nal.usda.gov/fdc/v1/search?api_key=DEMO_KEY -d '{"generalSearchInput":"ndbNumber:9181","includeDataTypes":{"Survey (FNDDS)": false, "Foundation": false, "Branded": false, "SR Legacy": true}}'

which finds a unique match from which you would need to parse the fdcId to get the food details entry point. Less than optimal I agree.
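For reference, the same search request can be built from Python with only the standard library. This is a sketch that mirrors the curl command above: the URL, the `DEMO_KEY` placeholder, and the body fields are copied from it and may have changed since, and the helper name `build_ndb_search_request` is hypothetical. The request is constructed but not sent here.

```python
import json
import urllib.request

# Endpoint copied from the curl command above; treat it as an assumption.
SEARCH_URL = "https://api.nal.usda.gov/fdc/v1/search"


def build_ndb_search_request(ndb_number, api_key="DEMO_KEY"):
    """Build a POST request that searches SR Legacy foods by ndbNumber."""
    body = {
        "generalSearchInput": f"ndbNumber:{ndb_number}",
        "includeDataTypes": {
            "Survey (FNDDS)": False,
            "Foundation": False,
            "Branded": False,
            "SR Legacy": True,
        },
    }
    return urllib.request.Request(
        f"{SEARCH_URL}?api_key={api_key}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_ndb_search_request("9181")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it. The layout of the response is not shown in this thread, so extracting the fdcId from it is left out of the sketch.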