apache / arrow

Apache Arrow is a multi-language toolbox for accelerated data interchange and in-memory processing
https://arrow.apache.org/
Apache License 2.0

[JS] Arrow Flight JavaScript Client or Example #17325

Open asfimport opened 3 years ago

asfimport commented 3 years ago

Is it possible to use Apache Arrow Flight to send data from a Python Web Server to a JavaScript browser client? If it is possible, is there a code example to use to get started?

If this is not possible, what is the fastest way to send data from a Python Web Server to Apache Arrow in the browser today? Would it be faster to send a Parquet file and unpack it client-side, or to send Arrow directly (with or without gzip)?

Reporter: Alex Monahan

Note: This issue was originally created as ARROW-9860. Please see the migration documentation for further details.

asfimport commented 3 years ago

Micah Kornfield / @emkornfield: I don't think there is an implementation of Flight in JS yet. Probably the fastest way to get data to the browser is to write it with RecordBatchStreamWriter (https://arrow.apache.org/docs/python/ipc.html) from Python over a WebSocket and then read it in JS. If bandwidth is a consideration, Python should now support compressing buffers (I need to double check), but I don't think this support has been added to the JS implementation yet.

CC @TheNeuralBit @trxcllnt to correct any inaccuracies.
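
A minimal sketch of the browser side of the WebSocket approach described above. It assumes the Python server writes a single Arrow IPC stream as binary frames and closes the socket when it is done. The names `receiveArrowTable` and `concatChunks` and the URL are hypothetical; `parse` is injected so the sketch has no dependencies, and in practice it would likely be `tableFromIPC` from the `apache-arrow` package.

```javascript
// Join binary chunks (ArrayBuffer or Uint8Array) into a single Uint8Array.
function concatChunks(chunks) {
  const total = chunks.reduce((n, c) => n + c.byteLength, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const c of chunks) {
    out.set(new Uint8Array(c), offset);
    offset += c.byteLength;
  }
  return out;
}

// Buffer every binary frame until the server closes the socket, then hand
// the joined bytes to `parse` (assumed to be an Arrow IPC parser).
function receiveArrowTable(url, parse, WebSocketImpl = globalThis.WebSocket) {
  return new Promise((resolve, reject) => {
    const chunks = [];
    let failed = false;
    const ws = new WebSocketImpl(url);
    ws.binaryType = "arraybuffer"; // deliver frames as ArrayBuffer, not Blob
    ws.onmessage = (event) => chunks.push(event.data);
    ws.onerror = () => {
      failed = true;
      reject(new Error("WebSocket error while receiving Arrow data"));
    };
    ws.onclose = () => {
      if (failed) return;
      try {
        resolve(parse(concatChunks(chunks)));
      } catch (err) {
        reject(err);
      }
    };
  });
}

// Hypothetical usage in the browser:
//   import { tableFromIPC } from "apache-arrow";
//   const table = await receiveArrowTable("ws://localhost:8765", tableFromIPC);
```

Buffering the whole stream keeps the sketch short; a streaming reader would avoid holding the full payload in memory.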

asfimport commented 3 years ago

Paul Taylor / @trxcllnt: JS does not support buffer-level compression, and possibly never should, because adding JS or WASM-based compression implementations to the browser bundles has significant drawbacks: a performance hit in the readers/writers, a significant increase in library size, etc.

The only widely and natively supported compression in browsers is gzip (and to a lesser extent, brotli), but it is applied at the message/chunk level in the browser's networking stack, so compression must be applied to the entire payload.
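
To illustrate whole-payload compression: over HTTP, a response sent with `Content-Encoding: gzip` is decompressed by the browser before `fetch` hands over the bytes, so no client code is needed. For a transport without that step (a WebSocket, for example), the sketch below assumes a runtime that provides the standard `DecompressionStream` API (recent browsers and Node.js 18+). The helper names `gunzip` and `gzip` are illustrative, and `gzip` is included only to show a round trip.

```javascript
// Pipe a byte array through a (de)compression transform and collect the result.
async function pipeBytes(bytes, transform) {
  const stream = new Blob([bytes]).stream().pipeThrough(transform);
  return new Uint8Array(await new Response(stream).arrayBuffer());
}

// Decompress a payload that was gzip-compressed as a whole.
function gunzip(bytes) {
  return pipeBytes(bytes, new DecompressionStream("gzip"));
}

// Compress a whole payload with gzip (here only to demonstrate a round trip).
function gzip(bytes) {
  return pipeBytes(bytes, new CompressionStream("gzip"));
}
```

The decompressed bytes would then be passed to the Arrow IPC reader unchanged, since the compression wraps the entire payload rather than individual buffers.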

asfimport commented 3 years ago

Hugh Matsubara: @trxcllnt So this means that @emkornfield's suggestion without the buffer compression part should still work, correct?

May I ask if there is any roadmap for implementing Flight in JS? Are there any major technical roadblocks (like unimplemented features in grpc-web) or limitations that are known to the community?

asfimport commented 3 years ago

David Li / @lidavidm: Hey Hugh, it's mostly a question of effort and approach (grpc-web, one of the gRPC proxies, a whole native REST-based implementation?). Also, the various gRPC-in-the-browser implementations don't support bidirectional streaming, so we'd need to figure out what to do with DoPut and DoExchange. Finally, I think the JS Arrow library has lagged behind a bit, so some work is needed there too (e.g. the 1.0.0 format changes).

All in all, I think it's feasible, but needs someone to drive it. I'm happy to answer questions about Flight and help with the effort as I have been interested, but never had the time to drive this by myself.
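
To make the bidirectional-streaming point concrete, the sketch below lists the streaming shape of each Flight RPC as declared in `Flight.proto` at the time of this discussion, and filters for the ones a browser client limited to unary and server-streaming calls (as grpc-web is) could invoke. The names `FLIGHT_RPCS` and `callableWithoutBidi` are hypothetical.

```javascript
// Streaming shape of each Flight RPC, as declared in Flight.proto.
const FLIGHT_RPCS = {
  Handshake: "bidi",
  ListFlights: "server-stream",
  GetFlightInfo: "unary",
  GetSchema: "unary",
  DoGet: "server-stream",
  DoPut: "bidi",
  DoExchange: "bidi",
  DoAction: "server-stream",
  ListActions: "server-stream",
};

// RPCs usable from a client that lacks bidirectional streaming.
function callableWithoutBidi(rpcs = FLIGHT_RPCS) {
  return Object.keys(rpcs).filter((name) => rpcs[name] !== "bidi");
}

console.log(callableWithoutBidi().join(", "));
```

Note that Handshake is also declared as a bidirectional stream, so authentication would need the same kind of workaround as DoPut and DoExchange.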

SanthoshBanavath commented 1 year ago

Hi, is there any update on this effort?

martinberoiz commented 11 months ago

Hello, I am also in this same situation. I made a small grpc-web demo (you can take a look here https://git.ligo.org/martin.beroiz/grpc-demo) and I'm now expanding it to replace the gRPC server with an Arrow Flight server.

The server side works as intended without much change, but I'm struggling on the browser side. I compiled the Arrow Flight protobufs and included them in the web app (I can upload a work-in-progress repo if anyone is interested).

The client.js side looks a bit like this:

import { HandshakeRequest, Criteria } from "./arrow_pb.js";
import { FlightServiceClient } from "./arrow_grpc_web_pb.js";

var myFlightService = new FlightServiceClient("http://localhost:8080");

$(document).ready(function () {
  $("#send-button").on("click", async function (e) {
    e.preventDefault();

    var request = new HandshakeRequest();
    var criteria = new Criteria();

    myFlightService.listFlights(request, criteria).then((err, response) => {
      if (err) {
        console.log(err);
      }
      console.log(response);
    });
  });
});

This doesn't work mainly because I don't know much about calling flight from a client in general.

  1. the function listFlights requires a request argument that I don't know how to create
  2. I have no idea what myFlightService.listFlights returns (a promise?) or how to deal with it to get a response and unpack the data after

A short example with a do_get request or something of the sort could help tremendously. Hopefully this will help someone move forward with a working example.
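
A hedged sketch addressing the two questions above, not a tested Flight client. In `Flight.proto`, ListFlights takes a `Criteria` message and returns a stream of `FlightInfo`, and grpc-web generated clients return a readable stream (with `data`, `error` and `end` events) for server-streaming calls rather than a promise. The helpers `collectStream` and `listFlights` are hypothetical, and the client is passed in so the sketch does not depend on the generated files.

```javascript
// Collect every message from a grpc-web server-streaming call into an array.
function collectStream(stream) {
  return new Promise((resolve, reject) => {
    const messages = [];
    stream.on("data", (message) => messages.push(message));
    stream.on("error", reject);
    stream.on("end", () => resolve(messages));
  });
}

// Call ListFlights: the request is a Criteria message, the second argument
// is gRPC metadata (headers), and the return value is a stream of FlightInfo.
function listFlights(client, criteria, metadata = {}) {
  return collectStream(client.listFlights(criteria, metadata));
}

// Hypothetical usage with the generated modules from the comment above:
//   const client = new FlightServiceClient("http://localhost:8080");
//   const infos = await listFlights(client, new Criteria());
//   infos.forEach((info) => console.log(info.toObject()));
```

A DoGet call would follow the same pattern, with a `Ticket` as the request and `FlightData` messages arriving on the stream.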

pbower commented 3 weeks ago

Hi, I'm wondering how you got on with this? I'm also looking for a Flight RPC option for JS in the browser. I have been looking into Rust WebAssembly too, but the compilation and package dependencies for tonic, tokio, socket2 and mio are causing grief. If you cracked it, I'd love to hear how you did it. Thanks!!

martinberoiz commented 3 weeks ago

@pbower Unfortunately, I could not move past that point. My JavaScript skills are lacking and I don't understand how to process the Flight messages into JS objects. I think that without a proper JavaScript library it will be quite the challenge.
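
One possible way to turn Flight messages into something the Arrow JS reader understands, offered as a sketch under assumptions rather than a verified implementation. Each `FlightData` message carries a `data_header` (the Arrow IPC message metadata) and a `data_body`; re-framing them in the IPC encapsulated message layout (continuation marker, little-endian metadata length, metadata padded to 8 bytes, body) should yield bytes that an IPC stream reader can parse. The names `frameFlightData` and `END_OF_STREAM` are hypothetical, and the generated accessors `getDataHeader_asU8()` and `getDataBody_asU8()` are assumed from the usual protobuf JS naming for bytes fields.

```javascript
const PREFIX_BYTES = 8; // 4-byte continuation marker + 4-byte metadata length

// Re-frame one FlightData (header + body) as an encapsulated IPC message.
function frameFlightData(header, body) {
  // Metadata is padded so that the body starts on an 8-byte boundary.
  const paddedHeaderLength = Math.ceil(header.byteLength / 8) * 8;
  const out = new Uint8Array(PREFIX_BYTES + paddedHeaderLength + body.byteLength);
  const view = new DataView(out.buffer);
  view.setUint32(0, 0xffffffff, true); // continuation marker
  view.setInt32(4, paddedHeaderLength, true); // metadata length incl. padding
  out.set(header, PREFIX_BYTES);
  out.set(body, PREFIX_BYTES + paddedHeaderLength);
  return out;
}

// Marker that terminates an IPC stream: continuation marker + zero length.
const END_OF_STREAM = new Uint8Array([0xff, 0xff, 0xff, 0xff, 0, 0, 0, 0]);

// Hypothetical usage for each message received from a DoGet stream:
//   frames.push(frameFlightData(msg.getDataHeader_asU8(), msg.getDataBody_asU8()));
// then join the frames, append END_OF_STREAM, and pass the bytes to the
// Arrow JS IPC reader.
```

The first message of a DoGet stream normally carries the schema with an empty body, followed by the record batches, which matches the order an IPC stream reader expects.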

martinberoiz commented 3 weeks ago

Oh, and for the record, I also tried the Rust WebAssembly route, but it didn't work for me because many Rust dependencies (even built-in ones, IIRC) deal with OS calls, I/O and other things that fall outside the scope of what can be compiled to WebAssembly and run in a browser. So I gave up on that as well.