OpenRefine / SparqlExtension

Extension which lets you create an OpenRefine project from a SPARQL query
BSD 3-Clause "New" or "Revised" License

Skip the level selection step #3

Closed antoine2711 closed 1 year ago

antoine2711 commented 2 years ago

Description

The step where you choose the JSON import level can be skipped, because it's always the same. The column names should come from the JSON field names of the first record.

So the project name and tags could/should be asked for in the previous step.


wetneb commented 2 years ago

Yes. Actually, I would expect this extension not to rely on the JSON importer at all, even if it retrieves the results in JSON internally.
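As an illustration of reading the results without the JSON importer, here is a minimal sketch, assuming the endpoint returns the standard SPARQL 1.1 JSON results format. The function name `sparql_json_to_table` and the sample data are hypothetical and not taken from the extension. It takes the column names from `head.vars` rather than from the first record, so that variables left unbound in the first row still get a column.

```python
import json


def sparql_json_to_table(raw):
    """Flatten a SPARQL 1.1 JSON results document into (columns, rows)."""
    doc = json.loads(raw)
    # "head.vars" lists the projected variables in query order.
    columns = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        # A variable that is unbound in a solution is simply absent,
        # so fall back to None for that cell.
        rows.append([binding.get(var, {}).get("value") for var in columns])
    return columns, rows


# Hypothetical sample response with one unbound variable in the second row.
sample = """
{"head": {"vars": ["item", "label"]},
 "results": {"bindings": [
   {"item": {"type": "uri", "value": "http://example.org/a"},
    "label": {"type": "literal", "value": "A"}},
   {"item": {"type": "uri", "value": "http://example.org/b"}}
 ]}}
"""
columns, rows = sparql_json_to_table(sample)
```

With this shape there is no import level to choose: every binding is one row and every variable is one column.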

WaltonG commented 2 years ago

Sure. I think I get the logic.

WaltonG commented 1 year ago

Hi @wetneb, I am working on a custom JSON parser but am having trouble understanding how the parsePreview and TabularImportingParserBase.readTable methods in the existing extensions work.

  1. Should the raw data be passed to the initialize-parser-ui sub command?
  2. How is the raw data handled by the parsePreview and TabularImportingParserBase.readTable methods?
  3. How is the raw data then returned to the frontend for preview?

I have gone through several extensions but didn't understand how the raw data gets parsed before being displayed on the preview page.

wetneb commented 1 year ago

Hi @WaltonG, the raw data is generally read by the backend directly. That means that, in your case, I would expect the SPARQL request to be made by the backend rather than by the frontend (as is currently the case).

To get a simple example, have you looked at the CommonsExtension? It is very similar, because it also lets people create projects from an API. In particular, the ImportingController should be interesting for you. I think it should answer your second question.

I would expect the general process to look like this.

The initialize-parser-ui sub-command does not do any data processing itself, it just returns the default values for the options displayed in the preview stage.

As you can see from the CommonsExtension, the preview and project creation stage share a lot of the same code: the only difference between the two is the limit on the number of rows. In your case it could make sense to include this limit in the SPARQL query, if you have a safe way to do so (it might require parsing the SPARQL query).
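One way to add such a limit without fully parsing the query is to wrap it as a subquery and put the LIMIT on the outer query. The sketch below assumes a SELECT query whose prologue contains only BASE/PREFIX declarations; the name `with_row_limit` is hypothetical. It does not handle comments in the prologue or FROM clauses (which are not allowed in subqueries), and the outer query is not guaranteed to preserve an ORDER BY from the inner one, so a real implementation would probably use a proper SPARQL parser instead.

```python
import re

# Matches the leading BASE/PREFIX declarations (the SPARQL prologue),
# which must stay outside the subquery.
_PROLOGUE = re.compile(
    r"^(?:\s*(?:BASE\s*<[^>]*>|PREFIX\s+[^\s:]*:\s*<[^>]*>))*",
    re.IGNORECASE,
)


def with_row_limit(query, limit):
    """Wrap a SELECT query in an outer query that caps the number of rows."""
    end = _PROLOGUE.match(query).end()
    prologue = query[:end].strip()
    body = query[end:].strip()
    if not re.match(r"SELECT\b", body, re.IGNORECASE):
        raise ValueError("only SELECT queries can be wrapped")
    # The newline after the body keeps a trailing '#' comment from
    # swallowing the closing braces.
    wrapped = "SELECT * WHERE {\n  {\n" + body + "\n  }\n}\nLIMIT " + str(int(limit))
    return prologue + "\n" + wrapped if prologue else wrapped


query = "PREFIX ex: <http://example.org/>\nSELECT ?s WHERE { ?s a ex:Thing }"
preview_query = with_row_limit(query, 100)
```

The preview stage could then run `preview_query` while project creation runs the original query, with the rest of the code shared between the two.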

To get an overview of the requests that are made, I would encourage you to try out the CommonsExtension with your browser's developer tools open, so that you can see the network requests made by the frontend to the backend (and inspect their parameters and responses).

Let me know if any of this is unclear.

WaltonG commented 1 year ago

Hi @wetneb, I now have an understanding of what is expected and how to achieve it. If I need any clarification, I will reach out.

WaltonG commented 1 year ago

@wetneb What could be the reason behind switching from Apache HttpClient in main OpenRefine to OkHttpClient in the CommonsExtension?

wetneb commented 1 year ago

I am not sure why @j-sal used one or the other; you could ask her directly. I think it would work with both.