h3abionet / afpo

AfPO: African Population Ontology

Create report about population #47

Open anitacaron opened 2 months ago

anitacaron commented 2 months ago

Fixes #46

I'd like to make sure we have all the information needed. The report has more than 12 thousand rows, so I'll upload it to Google Drive and share the link so it can be downloaded.

Here's a sample of the table: (click on the image to see it larger)

[Image: Screenshot 2024-06-24 at 16 55 20]

Melek-C commented 2 months ago

Hi @anitacaron, it looks like there are many duplicates. If you skip the family column, the table would be much smaller.

anitacaron commented 2 months ago

Yes, but I did that so it can be easily grouped by family or country of origin. Or don't you need the family information? Can I get feedback from Meriem and Mariem, please?
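One way to get both a smaller table and keep the family information for grouping might be to collapse rows that differ only in their family value. This is a minimal sketch, assuming the report is loaded as a list of dicts and that the column header is literally `family` (the real header may differ); the helper name `collapse_families` is hypothetical.

```python
def collapse_families(rows, family_key="family"):
    """Merge rows that are identical except for the family column.

    Each output row keeps the shared columns and carries a list of the
    distinct family values seen for it, in first-seen order.
    """
    grouped = {}
    for row in rows:
        # Everything except the family value identifies the logical row.
        shared = {k: v for k, v in row.items() if k != family_key}
        key = tuple(sorted(shared.items()))
        entry = grouped.setdefault(key, {**shared, family_key: []})
        family = row.get(family_key)
        if family and family not in entry[family_key]:
            entry[family_key].append(family)
    # Dicts preserve insertion order, so the output follows the input order.
    return list(grouped.values())


# Made-up placeholder rows, only to show the shape of the result.
sample = [
    {"population_name": "A", "region_name": "R1", "family": "F1"},
    {"population_name": "A", "region_name": "R1", "family": "F2"},
    {"population_name": "B", "region_name": "R2", "family": "F1"},
]
collapsed = collapse_families(sample)
```

With this shape the table has one row per population and region, and grouping by family is still possible by iterating over each row's list.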

Melek-C commented 2 months ago

Hi @mariemh23 could you please check if it's ok for you to plot the map?

mariemh23 commented 2 months ago

I think yes, it looks fine.

mariemh23 commented 2 months ago

Hi Anita, I took a closer look at the table and I think that there is something wrong with the database. 

First, there seems to be a separator problem in the table: some values have been shifted. For example, some values in the region_name column contain population_name values (e.g. line 6628). Similarly, the population_size column has shifted values, and some values are missing. Some values also contain the symbol ">" or "<" (< 1 million, >3 million), which interferes with converting population sizes into numerical values. One last point, as raised earlier by Alia: we need a top-level family group, because with such a large number of values it will be difficult to assign colors that are different enough for each group to be easily visible on the map. Best,

anitacaron commented 2 months ago

@mariemh23, I can fix the region_name and the family columns. However, the population_size is what is available in the ontology, and some values are missing. It would be good to discuss this with @abenkahla so I can change the ontology or just have a post-processing step to remove the symbol > in the population size annotation.
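A post-processing step of this kind could look roughly like the sketch below. It assumes the annotation values follow the patterns quoted above ("< 1 million", ">3 million") or are plain digits, and that commas are thousands separators; the function name `parse_population_size` is hypothetical. It keeps the qualifier instead of silently dropping it, so that "> 3 million" is not mistaken for an exact count.

```python
import re

# Assumed units; extend if the ontology uses other wording.
_UNITS = {"thousand": 1_000, "million": 1_000_000, "billion": 1_000_000_000}
_SIZE = re.compile(
    r"^\s*([<>]?)\s*([\d.,]+)\s*(thousand|million|billion)?\s*$",
    re.IGNORECASE,
)


def parse_population_size(raw):
    """Return (qualifier, value) or None if the text is missing or not a size.

    qualifier is "<", ">" or "" (exact); value is an int.
    """
    if raw is None or not raw.strip():
        return None  # missing annotation
    match = _SIZE.match(raw)
    if match is None:
        return None  # e.g. a shifted value that is not a size at all
    qualifier, number, unit = match.groups()
    try:
        value = float(number.replace(",", ""))
    except ValueError:
        return None  # stray punctuation such as "," or "."
    if unit:
        value *= _UNITS[unit.lower()]
    return qualifier, int(round(value))
```

Rows for which this returns `None` are either missing a size or hold a value that does not look like a size, which may also help spot shifted columns.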

anitacaron commented 2 months ago

@mariemh23 I've updated the table in the Google spreadsheet. Could you please check?

Melek-C commented 2 months ago

Hi @anitacaron, thanks for the update. We just checked the table with @mariemh23, and all the columns look good except the family one, as it's not standardized and not presented in a harmonized way. We should move forward with the map draft in the meantime, until we fix the family column.

mariemh23 commented 2 months ago

Hi Anita, thank you for your quick reply. I would also like to ask you about the language location column. Is it possible to have the geo-location coordinates in a separate column (separated from the language name)? Many thanks

anitacaron commented 2 months ago

it's not standardized and not presented in a harmonized way

@Melek-C let me know how I can change the family column

is it possible to have the geo-location coordinates in a separate column (separated from the language name).

@mariemh23 yeah, I can do it in the spreadsheet, but this is how it's available in the ontology
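For reference, the split could also be scripted. The exact layout of the language location values is not shown in this thread, so the sketch below assumes a language name followed by a trailing "latitude, longitude" pair, optionally in brackets; the sample value in the usage line is made up, and `split_language_location` is a hypothetical helper name.

```python
import re

# Assumed layout: "<language name> (<lat>, <lon>)" with optional brackets.
_COORDS = re.compile(
    r"[\s,;:(\[]*(-?\d+(?:\.\d+)?)\s*[,;\s]\s*(-?\d+(?:\.\d+)?)\s*[)\]]?\s*$"
)


def split_language_location(value):
    """Return (language_name, latitude, longitude).

    Latitude and longitude are None when no plausible trailing
    coordinate pair is found, and the value is returned unchanged.
    """
    match = _COORDS.search(value)
    if match is None:
        return value.strip(), None, None
    lat, lon = float(match.group(1)), float(match.group(2))
    # Reject number pairs that cannot be coordinates (assumes lat comes first).
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        return value.strip(), None, None
    return value[: match.start()].strip(), lat, lon


# Made-up sample value, only to illustrate the assumed layout.
name, lat, lon = split_language_location("Hausa (12.0, 8.5)")
```

If the ontology stores longitude first, or uses a different separator, the pattern and the range check would need to be adjusted accordingly.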

Melek-C commented 2 months ago

Hi @anitacaron, it's a bit tricky with the family column. I think we should keep just one term in the column and select only the big families (there are different subfamilies).

anitacaron commented 2 months ago

Maybe we could put the subfamilies in another annotation to make the ontology clearer?

Melek-C commented 1 month ago

Maybe we could put the subfamilies in another annotation to make the ontology clearer?

It could be interesting.

anitacaron commented 1 month ago

Do we have a final decision about the family annotation for the report? 😄

Melek-C commented 1 month ago

Hi @anitacaron, we are trying to fix some ambiguities in the family annotation, as it's not standardized. We will get back to you soon.