diffix / desktop

Cross-platform desktop application for data anonymization with Open Diffix Elm.
https://www.open-diffix.org

Frontend <> Backend communication #20

Closed · cristianberneanu closed this issue 3 years ago

cristianberneanu commented 3 years ago

We need to agree on the way the Frontend communicates with the Backend.

Since transpiling the reference code to JS resulted in poor performance, the anonymization code will stay in dotnet. Furthermore, I don't think it is a good idea to manually build the query AST in JS land. It couples the Frontend and Backend internals too much. Sending a SQL statement feels cleaner.

As input we send the filename, the query statement, and the anonymization settings; as output we get the query result or an error.
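For illustration, the contract could look roughly like this in TypeScript (every name here is hypothetical, not an agreed API):

```typescript
// Sketch of the proposed request/response contract. All identifiers are
// illustrative; the actual shape still needs to be agreed on.
interface AnonymizationRequest {
  fileName: string;                   // path to the CSV file to query
  query: string;                      // SQL statement to execute
  settings: Record<string, unknown>;  // anonymization settings, shape TBD
}

type AnonymizationResponse =
  | { success: true; result: unknown }  // query result (CSV or JSON rows)
  | { success: false; error: string };  // error message from the backend
```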

Option 1: anonymize using the CLI.

We pass the input as command-line arguments and get back the query result (as either CSV or JSON) on the stdout stream, or an error on the stderr stream (see the sketch after the lists below).

PROs:

CONs:
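For concreteness, a minimal sketch of driving the CLI from the Electron side, assuming a hypothetical `OpenDiffix.CLI` binary and made-up flag names:

```typescript
import { execFile } from "child_process";

// Hypothetical Option 1 wrapper: pass the input as command-line arguments,
// resolve with stdout (CSV or JSON), reject with stderr on failure.
// Binary name and flags are illustrative only.
function anonymizeViaCli(
  fileName: string,
  query: string,
  settingsJson: string
): Promise<string> {
  return new Promise((resolve, reject) => {
    execFile(
      "OpenDiffix.CLI",
      ["--file", fileName, "--query", query, "--settings", settingsJson],
      (error, stdout, stderr) => {
        if (error) reject(new Error(stderr || error.message));
        else resolve(stdout);
      }
    );
  });
}
```

One consequence of this model is that a fresh backend process is spawned for every query, so any startup cost is paid each time.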

Option 2: anonymize using IPC.

We will need an additional .NET project in this repository that loads the core reference library and dispatches anonymization requests to it. We pass the input as a JSON object and get back a JSON object with the result or an error. We need to decide whether we use a socket or the process's stdio streams for message exchange (a sketch of the stdio variant follows the lists below).

PROs:

CONs:
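A sketch of what the stdio variant might look like from the frontend, using newline-delimited JSON; the service name and message framing are assumptions, not decisions:

```typescript
import { spawn } from "child_process";
import { createInterface } from "readline";

// Hypothetical Option 2 transport: a long-lived backend process answering
// newline-delimited JSON requests over its stdio streams.
const backend = spawn("OpenDiffix.Service", [], {
  stdio: ["pipe", "pipe", "inherit"],
});
const lines = createInterface({ input: backend.stdout! });

let nextId = 0;
const pending = new Map<number, (response: unknown) => void>();

// Each response is assumed to carry the id of the request it answers.
lines.on("line", (line) => {
  const message = JSON.parse(line);
  pending.get(message.id)?.(message);
  pending.delete(message.id);
});

function sendRequest(payload: object): Promise<unknown> {
  const id = nextId++;
  backend.stdin!.write(JSON.stringify({ id, ...payload }) + "\n");
  return new Promise((resolve) => pending.set(id, resolve));
}
```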


I am slightly in favor of Option 1 (I don't consider its drawbacks too big).

sebastian commented 3 years ago

> I don't think it is a good idea to manually build the query AST in JS land. It couples the Frontend and Backend internals too much. Sending a SQL statement feels cleaner.

Yes, building the AST in JS only made sense as long as the AST could immediately be executed there too.

sebastian commented 3 years ago

I vote for Option 1 too.

I additionally vote for using JSON as the output, as it's easier to consume in the frontend than parsing some CSV output.

We can live without progress reports, and if we need them later we can get hacky then.
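To illustrate the difference: with JSON output the frontend consumes the result in a single call (assuming rows come back as an array of objects, which is itself still open):

```typescript
// Assumed shape: an array of row objects. With JSON, consuming the result
// is one call; CSV would need a parser library plus decisions about
// quoting, separators, and type coercion.
function parseResult(stdout: string): Array<Record<string, unknown>> {
  return JSON.parse(stdout);
}
```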

edongashi commented 3 years ago

Do we drop the JS CSV parser? If yes, do we use the backend to figure out the shape of the data when we load a file? If not, we need to use two different CSV libraries, each of which may have its own tiny differences.

sebastian commented 3 years ago

> Do we drop the JS CSV parser? If yes, do we use the backend to figure out the shape of the data when we load a file? If not, we need to use two different CSV libraries, each of which may have its own tiny differences.

Good point, @edongashi.

We either need another parser for the GUI, or we need to extend the Reference with an endpoint that returns a schema... In either case, as long as we want to support CSV, it seems the CLI interface must be extended to support providing a schema as part of the input too!?

cristianberneanu commented 3 years ago

I say we do the CSV parsing only in the backend/reference tool. To load the initial raw data (including the schema), the frontend could issue a standard `SELECT * FROM 'file_name'` query.
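A sketch of that idea, reusing the hypothetical `anonymizeViaCli` wrapper from the Option 1 sketch above and assuming the result comes back as a JSON array of row objects:

```typescript
// Hypothetical schema discovery: run a plain SELECT * against the file and
// read the column names off the first row of the JSON result.
async function loadSchema(fileName: string): Promise<string[]> {
  const output = await anonymizeViaCli(
    fileName,
    `SELECT * FROM '${fileName}'`,
    "{}" // settings irrelevant for schema discovery in this sketch
  );
  const rows: Array<Record<string, unknown>> = JSON.parse(output);
  return rows.length > 0 ? Object.keys(rows[0]) : [];
}
```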

cristianberneanu commented 3 years ago

This seems settled (at least for now).