web4bio / webgen

WebGen Vertically Integrated Project
https://web4bio.github.io/webgen/main/html/

ENH: allow a firebrowse fetch to be split by parameters #338

Closed kaczmarj closed 3 years ago

kaczmarj commented 3 years ago

this pull request implements splitting of large fetches into multiple smaller fetches. this removes our dependency on the episphere firebrowse.js module.

the main addition of this pr is the groupBy parameter to fetchFromFireBrowse. the parameters are split by the keys specified in groupBy. here is an example:

given the following parameters and group-by variables

const params = {
  "foo": ["a", "b", "c", "d"],
  "bar": ["w", "x", "y", "z"],
  "cat": "dog",
};
const groupBy = [{
  key: "foo",
  length: 2
}, {
  key: "bar",
  length: 3
}];

the parameters will be split into multiple objects

[{
  bar: ["w", "x", "y"],
  cat: "dog",
  foo: ["a", "b"]
}, {
  bar: ["z"],
  cat: "dog",
  foo: ["a", "b"]
}, {
  bar: ["w", "x", "y"],
  cat: "dog",
  foo: ["c", "d"]
}, {
  bar: ["z"],
  cat: "dog",
  foo: ["c", "d"]
}]

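this array of parameters is the cartesian product of the slices of each groupBy key: each key's values are cut into consecutive slices of at most N items, where N is the length given for that key in groupBy (the last slice may be shorter, like ["z"] above). keys that are not listed in groupBy, like cat, are copied unchanged into every object. one fetch is run per parameter object, concurrently using Promise.all, and the results are coalesced into a single return object.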
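
for anyone who wants to reason about the splitting without reading the diff, here is a minimal sketch of the idea. it is not the code in this pr: the names chunk, splitParams, fetchAll and fetchOne are hypothetical, and it assumes each individual fetch resolves to an array of records, so "coalescing" is shown as simple flattening.

```javascript
// Hypothetical helpers for illustration only; not the code from this PR.

// Cut an array into consecutive slices of at most `size` items.
function chunk(values, size) {
  const out = [];
  for (let i = 0; i < values.length; i += size) {
    out.push(values.slice(i, i + size));
  }
  return out;
}

// Expand `params` into the cartesian product of the slices of each
// groupBy key. Keys not named in groupBy are copied into every object.
function splitParams(params, groupBy) {
  let combos = [{ ...params }];
  for (const { key, length } of groupBy) {
    const slices = chunk(params[key], length);
    // Pair every combination built so far with every slice of this key.
    combos = combos.flatMap((combo) =>
      slices.map((slice) => ({ ...combo, [key]: slice }))
    );
  }
  return combos;
}

// Run one request per parameter object concurrently, then flatten.
// `fetchOne` is a stand-in for whatever performs a single request;
// it is assumed to return a promise that resolves to an array.
async function fetchAll(params, groupBy, fetchOne) {
  const pieces = splitParams(params, groupBy);
  const results = await Promise.all(pieces.map((p) => fetchOne(p)));
  return results.flat();
}
```

with the params and groupBy from the example above, splitParams(params, groupBy) yields the same four objects in the same order. note that Promise.all rejects as soon as any one fetch rejects, so a single failed slice fails the whole call in this sketch.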

i tested that these methods give identical outputs to their episphere counterparts (i.e., https://github.com/episphere/firebrowse/blob/a703c50643b3dd4b06777c3006f96b35ff245dc5/firebrowse.js#L67-L93) but i would appreciate it if people tested on their own.

also, can anyone try stress-testing these fetches by selecting many genes and cancer types before pressing submit? does this pr seem faster than the current dev branch?