micronutrientsupport / fct

Cleaning and standardisation of food composition tables

Some help needed! #1

Closed LuciaSegovia closed 4 years ago

LuciaSegovia commented 4 years ago

Hi @rbroth !!

I have uploaded 10 FCTs with the key variables standardized. Let me know what you think.

Also, I have some questions for you!

Q1: Can we 'flag' values in the FCT that are of low quality?

How should we handle values that the FCT marks as low quality, either by putting them between [] or (), or by using:

- a different font
- a different font colour
- bold
- an asterisk (*)

Q2: Are you going to import the original fct and change them in the tool, or are you going to import the cleaned version into the tool?

Thanks!!

Lucia

rbroth commented 4 years ago

Hi @LuciaSegovia,

Where have you uploaded them? In MS Teams?

1) Yes, we can flag/tag FCT entries with whatever we want. We can do this either by adding a property to a Fooditem (so the columns would be e.g. "ID, name, energy_in_kcal, calcium_in_mg, ... , is_low_quality_YN"), or by setting up a tagging system, so that a food item could have the "low-quality", "refrigerated" and "dairy" tags. Which solution is preferable depends on the use case.
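To make the two options concrete, here is a minimal Python sketch. The class names, the ID and the nutrient values are made up for illustration; only the column names and the three tags come from the comment above, and the real database schema may look quite different.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class FoodItemWithFlag:
    """Option 1: low quality is a fixed property, i.e. one extra column."""
    id: str
    name: str
    energy_in_kcal: float
    calcium_in_mg: float
    is_low_quality: bool = False


@dataclass
class FoodItemWithTags:
    """Option 2: any number of free-form tags attached to a food item."""
    id: str
    name: str
    energy_in_kcal: float
    calcium_in_mg: float
    tags: set[str] = field(default_factory=set)


# Illustrative values only, not taken from any FCT.
flagged = FoodItemWithFlag("F001", "Milk, cow, whole", 61.0, 113.0, is_low_quality=True)
tagged = FoodItemWithTags(
    "F001", "Milk, cow, whole", 61.0, 113.0,
    tags={"low-quality", "refrigerated", "dairy"},
)
```

The flag is simpler to query but answers only one yes/no question per food item; tags are open-ended but need an agreed vocabulary.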

2) I will clean the FCT data that we have, convert it into SQL INSERT statements and run those against the database. The tool itself is so far just the database that contains FCT data converted into a consistent format, hence my questions about things like "is phytic acid the same as phytin?" and such; I'm trying to figure out what the various columns in the FCTs mean.
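As a rough illustration of that pipeline, the sketch below loads a cleaned CSV and runs parameterised INSERT statements against a database. It assumes an in-memory SQLite database, a hypothetical `food_item` table and made-up rows; the actual database engine, schema and data are not described in this thread.

```python
import csv
import io
import sqlite3

# Made-up stand-in for a cleaned FCT export.
CLEANED_CSV = """id,name,energy_in_kcal,calcium_in_mg
F001,Maize flour,353,7
F002,"Milk, cow, whole",61,113
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE food_item (
        id TEXT PRIMARY KEY,
        name TEXT NOT NULL,
        energy_in_kcal REAL,
        calcium_in_mg REAL
    )
    """
)

# Each CSV row becomes a dict keyed by column name.
rows = list(csv.DictReader(io.StringIO(CLEANED_CSV)))

# Named placeholders let the driver handle quoting (note the comma in the milk name).
conn.executemany(
    "INSERT INTO food_item (id, name, energy_in_kcal, calcium_in_mg) "
    "VALUES (:id, :name, :energy_in_kcal, :calcium_in_mg)",
    rows,
)
conn.commit()

inserted = conn.execute("SELECT COUNT(*) FROM food_item").fetchone()[0]
```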

LuciaSegovia commented 4 years ago

Hi @rbroth ,

I have uploaded them to Teams now (in FCT_working-files) 👍 Only the variable names are 'standardized'. The next step would be to standardize the food groups (cereals, vegetables, etc.)

The thing with 'tagging' is that it is particular nutrients of a specific food item that are marked as low quality (e.g. the iron content of maize flour is low quality). So I think I will take your suggestion of having a low_Q_Nutrient column (YES or NO) for every single nutrient. What do you think?
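A per-nutrient flag could be derived from the textual markers mentioned in Q1 roughly as follows. This is only a sketch: the function names and the `FE_mg` column are hypothetical, it handles only `[]`, `()` and a trailing `*`, and markers such as font, colour or bold are lost in a plain-text export and would have to be recorded by hand.

```python
from typing import NamedTuple, Optional


class NutrientValue(NamedTuple):
    value: Optional[float]
    low_quality: bool


def parse_cell(cell: str) -> NutrientValue:
    """Split a raw FCT cell into a number and a low-quality flag."""
    text = cell.strip()
    if text in ("", "NA"):
        return NutrientValue(None, False)
    low_quality = False
    if (text[0], text[-1]) in (("[", "]"), ("(", ")")):
        text = text[1:-1]      # value given between [] or ()
        low_quality = True
    elif text.endswith("*"):
        text = text[:-1]       # value marked with a trailing asterisk
        low_quality = True
    return NutrientValue(float(text), low_quality)


def expand_row(row: dict) -> dict:
    """Add one low_Q_<nutrient> column (YES or NO) per nutrient column."""
    out = {}
    for nutrient, cell in row.items():
        parsed = parse_cell(cell)
        out[nutrient] = parsed.value
        out[f"low_Q_{nutrient}"] = "YES" if parsed.low_quality else "NO"
    return out
```

For example, `expand_row({"FE_mg": "[2.4]"})` keeps the numeric value in `FE_mg` and sets `low_Q_FE_mg` to `"YES"`.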

Regarding what each column means, I have uploaded a Markdown file with all the 'decisions' that I made when assigning a particular tagname for the FCT. I think it might explain some of the columns; otherwise, you can ask me if you want!

By the way, I can't access the link you shared in Teams.

Thanks!

Lucia

rbroth commented 4 years ago

Thanks, I can see the files. I've just had a quick look.

Regarding tagging, I'd like to know more about the use-case before I can give a good recommendation. It sounds like something that can get quite complicated. Let's split that into its own issue to discuss.

Thanks for the markdown file, that will be very useful.

The links I gave to Louise are for BGS' internal GitLab (sort of like GitHub) instance, so unfortunately they're only accessible on the BGS internal network. Once I've got the OK from Louise, I intend to upload everything to GitHub so everyone else can access it more easily.

LuciaSegovia commented 4 years ago

For the categories/food groups, I started 'cleaning' them (as in MAPS_WAFCT), but then I realised that we have to standardize all the food groups, so I stopped; that way it only has to be done once, with the correct food group names. OK, I just realised that we still need to keep the original food groups... Could you help me with that? It's only for MAPS_GMB and MAPS_KEN...
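One way to keep the original food groups is to write the standardized name into a new column instead of overwriting the old one. The sketch below assumes hypothetical column names (`food_group_original`, `food_group_standard`) and a made-up mapping; the real group names would come from the FCTs themselves.

```python
from typing import Optional

# Hypothetical mapping from original (lower-cased) group names to standard ones.
FOOD_GROUP_MAP = {
    "cereals and cereal products": "Cereals",
    "cereals & their products": "Cereals",
    "vegetables and vegetable products": "Vegetables",
}


def standardize_group(original: str) -> Optional[str]:
    """Return the standard group name, or None if the group is not mapped yet."""
    return FOOD_GROUP_MAP.get(original.strip().lower())


def add_standard_group(row: dict) -> dict:
    """Keep the original group untouched and add the standard one next to it."""
    return {**row, "food_group_standard": standardize_group(row["food_group_original"])}


example = add_standard_group(
    {"id": "F001", "food_group_original": "Cereals & their products"}
)
```

Unmapped groups come back as `None`, so they can be listed and reviewed rather than silently dropped.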

Regarding your question about MAPS_WAFCT, I don't know what a text encoding error is (sorry!). But if you mean the NA values, some of them are related to the tagging issue.

Also, there are two FCTs that I don't think we are going to use: MAPS_ETHFCT and MAPS_NGA.

Thank you!

UPDATE: I just updated KENFCT1 in Teams. We don't need to do GMBFCT; we are not going to use that FCT either.