grafana / k6

A modern load testing tool, using Go and JavaScript - https://k6.io
GNU Affero General Public License v3.0

Shared-memory read-only files #1931

Open na-- opened 3 years ago

na-- commented 3 years ago

Prompted by https://community.k6.io/t/ways-to-transfer-files-in-tests/1552

After https://github.com/k6io/k6/pull/1841, can we add an optional argument (e.g. r) to open() that would allow us to have only a single copy in memory? Like a SharedArray, but for string/ArrayBuffer data.

And unlike SharedArray(), we probably don't even need to have a name, we can use the absolute path to the file as the cross-VU sharing key.

This would solve the pain point of being able to load test services that expect large file uploads, which are currently impossible with more than a few VUs, even with SharedArray.

As a bonus, this sync.Once-like behavior will also fix https://github.com/k6io/k6/issues/1771, right?
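
To make the idea concrete, here is a minimal sketch of "load once, keyed by absolute path". It is only a model in plain JavaScript: the names openShared, normalizePath and sharedFiles are hypothetical and do not exist in k6, and since every VU runs in its own JS runtime, the real sharing would have to happen on the Go side.

```javascript
// Hypothetical sketch: none of these names are part of the k6 API.
const sharedFiles = new Map(); // normalized absolute path -> loaded contents

// Resolve "." and ".." segments so equivalent paths map to one cache key.
function normalizePath(path) {
  const parts = [];
  for (const segment of path.split('/')) {
    if (segment === '' || segment === '.') continue;
    if (segment === '..') parts.pop();
    else parts.push(segment);
  }
  return '/' + parts.join('/');
}

// Call `load` at most once per path; later callers get the same object back.
function openShared(path, load) {
  const key = normalizePath(path);
  if (!sharedFiles.has(key)) {
    sharedFiles.set(key, load(key));
  }
  return sharedFiles.get(key);
}

// Usage with a fake loader standing in for the real file read.
let reads = 0;
const fakeLoad = () => { reads += 1; return new Uint8Array([1, 2, 3]).buffer; };
const first = openShared('/assets/./photo.jpg', fakeLoad);
const second = openShared('/assets/tmp/../photo.jpg', fakeLoad);
console.log(first === second, reads); // true 1
```

Using the normalized path as the key is what removes the need for a user-supplied name, at the cost of treating two different paths to the same file (e.g. via symlinks) as separate entries.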

mstoykov commented 3 years ago

I have to add that the current way bodies are handled makes at least 1 additional copy to send the body and 1 more to keep it as response.request.body. So while the above is probably great, it will not have as big an effect as everybody probably expects unless those issues are fixed first.
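
A rough back-of-the-envelope model of that point, as a sketch only. The copy counts (one opened copy per VU, plus one copy to send and one kept as response.request.body per in-flight request) are taken from the comment above, and the 10 MiB file size and the function name estimatePeakBytes are assumptions for illustration. It ignores garbage collection timing and anything else the runtime allocates.

```javascript
// Rough peak-memory model; every number here is an assumption, not a measurement.
function estimatePeakBytes(vus, fileBytes, { sharedOpen = false } = {}) {
  // Copies held by open(): one per VU today, a single one if the file were shared.
  const openCopies = sharedOpen ? 1 : vus;
  // Per in-flight request: 1 copy to send the body + 1 kept as response.request.body.
  const requestCopies = 2 * vus;
  return (openCopies + requestCopies) * fileBytes;
}

const MiB = 1024 * 1024;
console.log(estimatePeakBytes(500, 10 * MiB) / MiB);                       // 15000
console.log(estimatePeakBytes(500, 10 * MiB, { sharedOpen: true }) / MiB); // 10010
```

Under these assumptions, sharing the opened file alone removes only about a third of the peak; the request-side copies dominate.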

na-- commented 3 years ago

Hmm didn't we fix some of these things with https://github.com/k6io/k6/pull/1841 or the previous moves to ArrayBuffer? And if not, aren't these things fixable with enough Proxy magic?

Assuming one such shared binary file is just a wrapped []byte on the Go side, with Proxy and freezing on the JS side to keep it immutable, we should be able to pass it around without any copies, right?
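
A minimal sketch of the JS side of that idea, assuming the shared data is exposed as a Uint8Array. The name readOnlyBytes is hypothetical, and this is not how k6 or goja implement anything today. Note that Object.freeze cannot be applied to a non-empty typed array (it throws a TypeError), so the sketch relies on Proxy traps alone and hides everything except indexed reads and length.

```javascript
// Hypothetical sketch: a read-only view over shared bytes using a Proxy.
function readOnlyBytes(bytes) {
  const deny = () => { throw new TypeError('shared bytes are read-only'); };
  return new Proxy(bytes, {
    get(target, prop) {
      if (prop === 'length') return target.length;
      // Allow plain indexed reads such as view[0].
      if (typeof prop === 'string' && /^\d+$/.test(prop)) {
        return target[Number(prop)];
      }
      // Hide everything else, including mutating methods like fill() and set().
      return undefined;
    },
    set: deny,
    defineProperty: deny,
    deleteProperty: deny,
  });
}

const backing = new Uint8Array([1, 2, 3]);
const view = readOnlyBytes(backing);
console.log(view[0], view.length); // 1 3
try {
  view[0] = 9;
} catch (err) {
  console.log(err.message); // shared bytes are read-only
}
console.log(backing[0]); // 1
```

The view never copies the backing bytes, but anything that needs a real ArrayBuffer (such as the HTTP layer) would still have to know how to unwrap it, which is the integration work discussed in this thread.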

mstoykov commented 3 years ago

> Hmm didn't we fix some of these things with #1841 or the previous moves to ArrayBuffer?

I am not certain it has any relevance, although it seems that for []byte and possibly ArrayBuffer it just works, provided I haven't missed a place where an actual copy is made. If you are talking about response.request.body, that isn't touched at all.

> And if not, aren't these things fixable with enough Proxy magic?

In theory, yes, but someone will need to do it ;) And it will need to be done for the whole response or request object, as you can't just proxy 1 field.

> Assuming one such shared binary file is just a wrapped []byte on the Go side, with Proxy and freezing on the JS side to keep it immutable, we should be able to pass it around without any copies, right?

Again, in theory, yes, but someone will need to do it, and given the way k6/http integrates with httpext, that will probably not be as easy.

Also, I would argue that a lot of people will put those in things like FormData which ... will just copy it. Even using it with http.File and the magical object->form-data transformation will make copies (on each call in each VU).

I am not saying that it is impossible or even that it will be hard (for some definition), just that having "immutable" return values from open() does mostly nothing if we don't also integrate them with the code that will consume them.

na-- commented 3 years ago

Regarding Response.request.body, we should be able to just reference the original ArrayBuffer or string JS variable in Response.request.body, right? That will prevent any unnecessary copying and will also work for the SharedBlob object (or whatever the thing we implement in this issue is going to be called).

The only time Response.request.body should be different from the request body is when k6 marshals it, i.e. the user has passed a plain JS object that k6 internally encoded into an application/x-www-form-urlencoded or a multipart/form-data content-type string.
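
The rule described above could be sketched roughly like this. The helper name requestBodyFor is hypothetical and not part of k6, and URLSearchParams is used only as a stand-in for whatever encoding k6 performs internally; the multipart case is left out.

```javascript
// Hypothetical sketch: what Response.request.body would hold under this proposal.
function requestBodyFor(body) {
  if (typeof body === 'string' || body instanceof ArrayBuffer) {
    // Already in wire form: keep a reference to the original value, no copy.
    return body;
  }
  if (body !== null && typeof body === 'object') {
    // Plain object: it gets marshalled, so the stored body is a new string.
    return new URLSearchParams(body).toString();
  }
  return null;
}

const upload = new Uint8Array([1, 2, 3]).buffer;
console.log(requestBodyFor(upload) === upload);          // true
console.log(requestBodyFor({ name: 'a', size: '1' }));   // name=a&size=1
```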

anton-makarov-photobox commented 3 years ago

@na-- Hi. You mention that you open the file with SharedArray. Do you have an example of it?

 k6:
    image: loadimpact/k6:latest
    networks:
      - performance_test_network
    volumes:
      - ./dist:/scripts
      - ./assets:/assets
    ports:
      - "6565:6565"
    environment:
      - K6_OUT=influxdb=http://influxdb:8086/k6

I put a folder with images in the Docker container. Then I read a file that lists the names of these files and try to open all of them with SharedArray:

const assetsList: any = new SharedArray('assetsList', () => {
  return JSON.parse(open(`${assetsFolderPath}/${listFile}`));
});

const assets: any = new SharedArray('assets', () => {
  return assetsList.map((asset) => open(`${assetsFolderPath}/${asset.name}`, 'b'));
});

for (const asset of assets) {
  const data = new FormData();
  data.append('photo', http.file(asset, fileName));
  ....
  http.post(url, data.body());
}

The error we get: ERRO[0028] invalid type map[string]interface {}, expected string, []byte or ArrayBuffer

We decided to use SharedArray because we got an error in k6 cloud when we tried to run our script with 500 VUs. Before that we did not use SharedArray, and it worked fine with a low number of VUs:

const assetsList = JSON.parse(open(`${assetsFolderPath}/${listFile}`));

const assets = assetsList.map((asset) => {
  return {
    file: open(`${assetsFolderPath}/${asset.name}`, 'b'),
    name: asset.name,
  };
});

na-- commented 3 years ago

@anton-makarov-photobox, I gave SharedArray as an example of a read-only data structure that's shared between multiple VUs. And I used it because it specifically doesn't support binary data, so it doesn't solve the issue of having large binary files that you want to upload.
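
A small sketch of why the binary data gets lost, assuming SharedArray stores its elements through a JSON round trip (which would match the error above, where a plain object arrived instead of an ArrayBuffer). The helper jsonRoundTrip is only an illustration and not k6 code.

```javascript
// Simulate storing a value via JSON, as a JSON-backed shared structure would.
function jsonRoundTrip(value) {
  return JSON.parse(JSON.stringify(value));
}

// Stand-in for the ArrayBuffer returned by open(path, 'b').
const binary = new Uint8Array([0xff, 0xd8, 0xff]).buffer;
const stored = jsonRoundTrip(binary);

console.log(JSON.stringify(binary));       // {}
console.log(stored instanceof ArrayBuffer); // false
console.log(Object.keys(stored).length);    // 0
```

An ArrayBuffer has no enumerable own properties, so it serializes to an empty object and the bytes are gone by the time the element is read back.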

Unfortunately, there is no good way to do this in k6 yet... :disappointed: It's why this issue exists, to solve that problem. Until we implement it though, I'd suggest using fewer VUs (because every VU will keep a copy of the opened files in memory) with http.batch() for concurrency.
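
A rough sketch of that workaround, reusing the { file, name } asset shape and the 'photo' field from the script above. The helper buildUploadBatch is hypothetical, and wrapFile is injected so the snippet runs outside k6; in a k6 script it would be http.file.

```javascript
// Build the request list for a single http.batch() call.
// `wrapFile` is an assumption for illustration; in k6 pass http.file.
function buildUploadBatch(url, assets, wrapFile) {
  return assets.map((asset) => ({
    method: 'POST',
    url: url,
    // An object body containing file fields is sent as multipart/form-data by k6.
    body: { photo: wrapFile(asset.file, asset.name) },
  }));
}

// Assumed usage inside a k6 script:
//   const responses = http.batch(buildUploadBatch(url, assets, http.file));
```

Each VU still holds its own copy of every opened file and each request still copies the body, so this only trades VU count for in-VU concurrency.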