RDFLib / sparqlwrapper

A wrapper for a remote SPARQL endpoint
https://sparqlwrapper.readthedocs.io/

Convert SPARQLWrapper.SmartWrapper to pandas dataframe #205

Open appliedgraphs opened 2 years ago

appliedgraphs commented 2 years ago

I want to convert a SmartWrapper result to a pandas DataFrame. We have two relevant issues:

My goal is to execute a SPARQL query, get a result, and then transform it into a DataFrame. I am sure I am missing something, because this must be a very common need.
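As a rough illustration of that query-to-DataFrame step, here is a minimal sketch that flattens a SPARQL 1.1 JSON results document (the dict shape SPARQLWrapper returns from query().convert() when the return format is JSON) into one plain dict per row. The helper name bindings_to_rows and the sample data are made up for this example and are not part of SPARQLWrapper.

```python
from typing import Any, Dict, List, Optional


def bindings_to_rows(results: Dict[str, Any]) -> List[Dict[str, Optional[str]]]:
    """Flatten a SPARQL 1.1 JSON results document into one dict per row.

    Hypothetical helper for illustration; not part of SPARQLWrapper.
    """
    variables = results["head"]["vars"]
    rows = []
    for binding in results["results"]["bindings"]:
        # A variable that is unbound in a solution is simply absent from
        # the binding, so map it to None to keep the columns aligned.
        rows.append(
            {var: binding[var]["value"] if var in binding else None
             for var in variables}
        )
    return rows


# Made-up sample in the standard JSON results layout.
sample = {
    "head": {"vars": ["city", "population"]},
    "results": {
        "bindings": [
            {
                "city": {"type": "literal", "value": "Berlin"},
                "population": {"type": "literal", "value": "3645000"},
            },
            {"city": {"type": "literal", "value": "Atlantis"}},
        ]
    },
}

rows = bindings_to_rows(sample)
```

With pandas installed, pandas.DataFrame(rows) turns the list into a DataFrame. Values stay as strings in this sketch; converting them according to their datatype is left out.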

nicholascar commented 2 years ago

@appliedgraphs, have you seen this existing script? https://github.com/RDFLib/sparqlwrapper/blob/master/SPARQLWrapper/sparql_dataframe.py

afk314 commented 2 years ago

I hadn't, but it's nice and easy to follow. It also works wonderfully! What can be done to make this easier to find or otherwise more integrated with rdflib?

nicholascar commented 2 years ago

Yes, I think pandas is well known enough to be supported as a standard output option. The only hitch is that a basic installation of SPARQLWrapper shouldn't require pandas, since it's large. It can be included, but it needs to be an optional pip install extra.

Something like a pandas extra declared in the package metadata, and then pip install sparqlwrapper[pandas] to install.

Would love to see a PR for this...

eggplants commented 2 years ago

@nicholascar Already: https://github.com/RDFLib/sparqlwrapper/blob/2a6e2d3ddbc3fe38ca47d6d05f23c9b61ff82366/setup.cfg#L53-L54

nicholascar commented 2 years ago

Great, thanks @eggplants, so the installation tasks are already done. I suppose a deeper integration of pandas could then be made, as long as appropriate warnings are raised if users try to call pandas exports without the necessary installation.
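One way such a guard could look is sketched below: the optional module is imported only when an export is requested, and a missing installation produces a message that names the extra to install. This sketch raises an ImportError rather than emitting a warning. The names import_optional and rows_to_dataframe are hypothetical and do not exist in SPARQLWrapper.

```python
import importlib
from types import ModuleType


def import_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency, or fail with an install hint.

    Hypothetical helper for illustration; not part of SPARQLWrapper.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name} is required for this export; "
            f"install it with: pip install sparqlwrapper[{extra}]"
        ) from exc


def rows_to_dataframe(rows):
    """Build a DataFrame from a list of dicts, importing pandas lazily."""
    # pandas is only imported here, so a basic installation never needs it.
    pd = import_optional("pandas", extra="pandas")
    return pd.DataFrame(rows)
```

Keeping the import inside the function means the base package imports cleanly without pandas, and only callers of the export see the error.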

WolfgangFahl commented 2 years ago

This is what motivated pyLoDStorage (https://github.com/WolfgangFahl/pyLoDStorage), where you get a list of dicts that you can use immediately in pandas. See #160.