threefoldtech / tfchain_graphql

Graphql for TFchain
Apache License 2.0

Mismatch between graphql data and whereisip #148

Closed MohamedElmdary closed 6 months ago

MohamedElmdary commented 9 months ago

The country name Czechia comes from whereisip, but GraphQL returns a different name, Czech Republic.


samaradel commented 8 months ago

The same happens for United States and United Kingdom.

sameh-farouk commented 8 months ago

Our squid processor uses the API at https://restcountries.com/v2/all to initialize the countries table. Country names can differ between APIs. Can we show the country's name in the UI but rely on the country code for filtering and similar logic?

sameh-farouk commented 8 months ago

There is a v3 of the API that we use in GraphQL (we are currently on v2). In v3 the name becomes an object that contains a common name (which looks like what you want) and an official name (similar to the one we are using at the moment). For example, the name value for United Kingdom looks like this:

{
  "common": "United Kingdom",
  "official": "United Kingdom of Great Britain and Northern Ireland",
  ...
}

and for Czechia:

{
  "common": "Czechia",
  "official": "Czech Republic",
  ...
}

So I can switch from v2 to v3 in the GraphQL init script and use the common name. This would solve the instances reported by @MohamedElmdary and @samaradel, but I am not sure whether there is an easy way to retrieve all country names from the whereisip API to verify that we won't hit another mismatch.
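A minimal sketch of what "use the common name" could look like, assuming records shaped like the v3 name object quoted above. The sample list and the helper `country_display_name` are hypothetical and only for illustration; this is not the actual init script.

```python
# Hypothetical sample shaped like the v3 "name" object quoted in this thread.
# All other fields of the real API response are left out.
SAMPLE_V3 = [
    {
        "name": {
            "common": "United Kingdom",
            "official": "United Kingdom of Great Britain and Northern Ireland",
        }
    },
    {"name": {"common": "Czechia", "official": "Czech Republic"}},
]


def country_display_name(record):
    """Return the name to store in the countries table.

    v3-style records carry an object with "common" and "official";
    v2-style records carry a plain string, which is returned unchanged.
    """
    name = record["name"]
    if isinstance(name, str):
        return name
    # Prefer the common name; fall back to the official one if it is missing.
    return name.get("common") or name["official"]


names = [country_display_name(r) for r in SAMPLE_V3]
print(names)
```

With the sample above, the stored names are the short forms that whereisip reportedly returns for these two countries.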

Omarabdul3ziz commented 7 months ago

Hi, I was debugging an issue on gridproxy, https://github.com/threefoldtech/tfgrid-sdk-go/issues/699, which shows a broken region filter for nodes.

How does the region filter work on the proxy? We join the node table with the country table on the condition node.country = country.name.

But it looks like the data returned from GraphQL is inconsistent: the node table uses the common short name, while the country table uses the official full name.

That is why /nodes?region=americas doesn't list the United States nodes. The same applies to the United Kingdom and Czechia.
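The effect of joining on the name can be reproduced with a small in-memory sketch. The rows, node IDs, and function names below are made up for illustration and do not come from the proxy database; only the join condition (node country name equals country table name) is taken from the description above.

```python
# Hypothetical rows: nodes carry the common short name,
# the country table carries the official long name.
nodes = [
    {"node_id": 1, "country": "United States"},
    {"node_id": 2, "country": "Czechia"},
    {"node_id": 3, "country": "Belgium"},
]
countries = [
    {"name": "United States of America", "region": "Americas"},
    {"name": "Czech Republic", "region": "Europe"},
    {"name": "Belgium", "region": "Europe"},
]


def nodes_in_region(nodes, countries, region):
    """Emulate the join node.country = country.name, filtered by region."""
    names = {c["name"] for c in countries if c["region"].lower() == region.lower()}
    return [n["node_id"] for n in nodes if n["country"] in names]


def unmatched_nodes(nodes, countries):
    """List nodes whose country name has no row in the country table."""
    known = {c["name"] for c in countries}
    return [n["node_id"] for n in nodes if n["country"] not in known]


print(nodes_in_region(nodes, countries, "americas"))  # the US node is missing
print(unmatched_nodes(nodes, countries))
```

A check like `unmatched_nodes` is one way to spot further mismatches before they surface as a broken filter.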

sameh-farouk commented 7 months ago

Update: PR is ready for review.

It is based on the changes I mentioned in my previous comment.

As I mentioned in the PR description, this is a temporary, easy workaround. Although it should resolve the mismatch for the reported countries, it may not work for all countries, because joining data from different APIs based on country names is not reliable.

To elaborate: in our squid processor, we fetch cities from the repository available at https://raw.githubusercontent.com/shivammathur/countrycity/master/data/geo.json and countries from the https://restcountries.com/ API. We then join these data sets based on the country name. Similarly, in the Gridproxy, we fetch node data from the chain and country data from GraphQL and join them based on the country name.

This process is not always accurate, as country names often have alternative spellings. Hence, unless both sides use the same data source, a match is not guaranteed.

To properly resolve this issue, I suggest using the same API on both sides (if possible), or a more unique identifier such as the country code. We could also resort to crafting our own data source and using it. However, this would require changes in multiple services, hence I'm offering this patch as a temporary solution for now.
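As a rough sketch of the country-code idea, assuming both sides could expose an ISO 3166-1 alpha-2 code: the fields `country_code` and `code`, the rows, and the function name are hypothetical, and nothing in this thread confirms the current schemas carry such a field.

```python
# Hypothetical rows: names still differ between the two sources,
# but both sides are assumed to carry the same two-letter code.
nodes = [
    {"node_id": 1, "country": "United States", "country_code": "US"},
    {"node_id": 2, "country": "Czechia", "country_code": "CZ"},
    {"node_id": 3, "country": "Belgium", "country_code": "BE"},
]
countries = [
    {"code": "US", "name": "United States of America", "region": "Americas"},
    {"code": "CZ", "name": "Czech Republic", "region": "Europe"},
    {"code": "BE", "name": "Belgium", "region": "Europe"},
]


def nodes_in_region_by_code(nodes, countries, region):
    """Join on the country code instead of the name, filtered by region."""
    codes = {c["code"] for c in countries if c["region"].lower() == region.lower()}
    return [n["node_id"] for n in nodes if n["country_code"].upper() in codes]


print(nodes_in_region_by_code(nodes, countries, "americas"))
print(nodes_in_region_by_code(nodes, countries, "europe"))
```

The name then becomes display-only data, so a spelling difference between sources no longer affects filtering.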

sameh-farouk commented 7 months ago

Is everything good now, @MohamedElmdary @samaradel @Omarabdul3ziz? The patch is on the dev, qa and test networks.

Omarabdul3ziz commented 6 months ago

Yes, all good now on the 4 networks, thanks.