Arquisoft / wiq_en2b

WIQ en2b
https://arquisoft.github.io/wiq_en2b/

SPARQL #11

Closed: GOLASOOO closed this issue 7 months ago

GOLASOOO commented 8 months ago

Discuss SPARQL implementations.

GOLASOOO commented 8 months ago

[Moved here from issue #6, as it is a rather big topic] This answer is a bit long, but I have tried to simplify it and make it as clear as possible. I have been researching the different approaches we could take to query Wikidata.

Spring Boot:

This is some info about most of the libraries:

There are plenty of them, and all of them seem to use REST to query the data from Wikidata's endpoint. Also, as far as I can tell, all of them use JDBC for querying. However, given how many there are, I have not checked every one.

✅ Pros (for most of the libraries):

UO283615 commented 8 months ago

Regarding SPARQL, I think it is worth mentioning this question from the Arquisoft FAQ, which includes a webpage that allows us to test our queries.

jjgancfer commented 8 months ago

Regarding whether we should query from the frontend or the backend: if we consider performance, I think we should query from the frontend, mainly because performance would improve that way. Once the question has been answered, it can be sent to the server if we need it.

GOLASOOO commented 8 months ago

Sorry for such a long answer, but for now we have two possible approaches to querying. The first would be to use the user agent as a client of the SPARQL Query Service, as you suggested.

Pros ✅:

Cons ❌:

On the other hand, we could have a service that automatically stores questions and their answers in a DB, from which the backend serves them to the user (this is more secure than querying from the user's side, in my opinion). This service could run 24/7, storing questions and answers until, for example, a certain number is reached.
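The idea of filling a store until a target size is reached could be sketched roughly as follows. This is only an illustration under assumptions: SQLite stands in for whatever DB the team picks, and the table name `questions`, the helper names, and the `fetch_batch` callback (which would wrap the actual Wikidata query) are all hypothetical.

```python
import sqlite3
from typing import Callable, Iterable, Tuple

QA = Tuple[str, str]  # (question text, answer text)


def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: one row per question, duplicates rejected.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS questions ("
        "id INTEGER PRIMARY KEY, "
        "question TEXT UNIQUE NOT NULL, "
        "answer TEXT NOT NULL)"
    )
    conn.commit()


def count_questions(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM questions").fetchone()[0]


def fill_until(
    conn: sqlite3.Connection,
    fetch_batch: Callable[[], Iterable[QA]],
    target: int,
    max_rounds: int = 100,
) -> int:
    """Insert batches until at least `target` questions are stored.

    `fetch_batch` is an assumed callback that returns new
    (question, answer) pairs, e.g. built from a SPARQL result.
    `max_rounds` bounds the loop in case the source stops
    producing new questions.
    """
    for _ in range(max_rounds):
        if count_questions(conn) >= target:
            break
        conn.executemany(
            "INSERT OR IGNORE INTO questions (question, answer) VALUES (?, ?)",
            list(fetch_batch()),
        )
        conn.commit()
    return count_questions(conn)
```

The backend would then only read from this table when serving a game, so players never hit the SPARQL endpoint directly.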

Pros ✅:

Cons ❌:

Thanks for your opinion; it is helpful, as we have not yet decided which to use.

jjgancfer commented 8 months ago

On the other hand, we could have a service that automatically stores questions and their answers in a DB, from which the backend serves them to the user (this is more secure than querying from the user's side, in my opinion). This service could run 24/7, storing questions and answers until, for example, a certain number is reached.

I agree with this approach. I think it would be the best one, mainly because with the new focus on question/answer pairs we would have to use multi-threading until the database is filled to a useful degree. We would also be able to write it in a simpler language (we could, for instance, use Django with a Python module to query). We can also find different tools here.
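As a rough illustration of the multi-threading idea in Python, several queries could be run in parallel with a small thread pool. The function names and the `fetch_one` callback (which would perform the real query for one topic) are hypothetical, and the worker count is an assumption that would need to stay low enough to respect whatever limits the public endpoint applies.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def fetch_all(
    topics: List[str],
    fetch_one: Callable[[str], List[str]],
    max_workers: int = 4,
) -> Dict[str, List[str]]:
    """Run `fetch_one` for every topic using a pool of worker threads.

    `fetch_one` is an assumed callback that queries one topic and
    returns its results. Threads suit this job because the work is
    mostly waiting on network responses.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as `topics`.
        results = pool.map(fetch_one, topics)
        return dict(zip(topics, results))
```

If any call to `fetch_one` raises, the exception surfaces when its result is read, so a real service would probably want to catch errors per topic and retry later.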

Cons ❌:

  • More complex architecture and more resources are required even if they end up not being used.
  • We will possibly have bandwidth problems if many users are playing simultaneously.
  • More complex code and service interactions.

I think these disadvantages are worth mentioning, but I think the second one is negated by the new model of querying for questions early and storing them. Likewise, the first might be partly mitigated if we simplify the service (making it something like running a script, for instance), but I think that kind of solution would be sub-optimal.

UO283615 commented 7 months ago

I've been working on the queries and managed to write two that return all the countries and capitals in the world: one in XML format with a POST request and one in JSON with a GET request. I've attached a JSON file with them to this message; it can be imported into Postman, and the queries will be loaded there. ASW-Queries.json
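For readers without Postman, the two request styles might look roughly like this in Python. This is a sketch, not the contents of ASW-Queries.json: the SPARQL text, the use of `wd:Q6256` (country), `wdt:P31` (instance of) and `wdt:P36` (capital), the `USER_AGENT` string, and all function names are assumptions made for illustration. The requests are only built here, not sent.

```python
import json
import urllib.parse
import urllib.request
from typing import List, Tuple

ENDPOINT = "https://query.wikidata.org/sparql"

# Assumed query: countries with their capitals, labelled in English.
QUERY = """
SELECT ?countryLabel ?capitalLabel WHERE {
  ?country wdt:P31 wd:Q6256 ;
           wdt:P36 ?capital .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
"""

# Hypothetical identifier; a real client should describe itself properly.
USER_AGENT = "wiq-example/0.1"


def build_get_request(query: str) -> urllib.request.Request:
    # GET: the query travels in the URL, JSON results are requested.
    url = ENDPOINT + "?" + urllib.parse.urlencode({"query": query})
    headers = {
        "Accept": "application/sparql-results+json",
        "User-Agent": USER_AGENT,
    }
    return urllib.request.Request(url, headers=headers)


def build_post_request(query: str) -> urllib.request.Request:
    # POST: the query travels in a form-encoded body, XML results are requested.
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    headers = {
        "Accept": "application/sparql-results+xml",
        "Content-Type": "application/x-www-form-urlencoded",
        "User-Agent": USER_AGENT,
    }
    return urllib.request.Request(ENDPOINT, data=body, headers=headers, method="POST")


def parse_json_results(payload: str) -> List[Tuple[str, str]]:
    # SPARQL JSON results keep rows under results.bindings,
    # each variable being an object with a "value" field.
    data = json.loads(payload)
    return [
        (row["countryLabel"]["value"], row["capitalLabel"]["value"])
        for row in data["results"]["bindings"]
    ]
```

To actually run a query, the request object would be passed to `urllib.request.urlopen` and the response body handed to `parse_json_results` (for the GET variant).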