scramjetorg / scramjet

Public tracker for Scramjet Cloud Platform, a platform that bring data from many environments together.
https://www.scramjet.org
MIT License
253 stars 20 forks source link

Fix array nesting in delegate #107

Closed SterlingKelley closed 3 years ago

SterlingKelley commented 3 years ago

It appears to me that the clusterFunc method passed to a delegate method is being too deeply nested.

Example code that should work:

const { DataStream } = require("./lib");

function* gen() {
    for (let z = 0; z < 1e3; z++) yield z;
}

DataStream.fromIterator(gen())
    .distribute((stream) => {
        return stream.filter((num) => {
            if (num < 2) return false;
            if (num == 2) return true;
            for (var i = 2; i < num / 2; i++) {
                if (num % i === 0) return false;
            }
            return true;
        });
    })
    .toArray()
    .then(console.log);

When running I get the following output:

[
  {
    error: {
      message: 'The "id" argument must be of type string. Received null',
      stack: 'TypeError [ERR_INVALID_ARG_TYPE]: The "id" argument must be of type string. Received null\n' +
        '    at validateString (internal/validators.js:117:11)\n' +
        '    at Module.require (internal/modules/cjs/loader.js:1035:3)\n' +
        '    at require (internal/modules/cjs/helpers.js:77:18)\n' +
        '    at /Users/sterling/workplace/oss/scramjet/lib/stream-child.js:24:35\n' +
        '    at Array.map (<anonymous>)\n' +
        '    at /Users/sterling/workplace/oss/scramjet/lib/stream-child.js:17:55\n' +
        '    at processTicksAndRejections (internal/process/task_queues.js:97:5)'
    }
  },
  ... 15 more items
]

I've traced this back to an issue between multi-stream.js wrapping the clusterFunc in an extra array and stream-worker.js failing to serialize it to a string.

This 1 line change appears to enable both passing in a function directly as above as well as maintaining the ability to pass in a string referencing a module to require. Output of above function with this patch:

[
   97, 193, 257, 353, 449, 577, 641, 673, 769, 929,   2,   3,
   67, 131, 163, 227, 419, 547, 643, 739,   4,   5,  37, 101,
  197, 229, 293, 389, 421, 613, 677, 709, 773, 997,   7,  71,
  103, 167, 199, 263, 359, 487, 647, 743, 839, 967,  41,  73,
  137, 233, 457, 521, 617, 809, 937,  11,  43, 107, 139, 331,
  491, 523, 587, 619, 683, 811, 907, 971,  13, 109, 173, 269,
  397, 461, 557, 653, 877, 941,  47,  79, 239, 271, 367, 431,
  463, 719, 751, 911,  17, 113, 241, 337, 401, 433, 593, 881,
  977,  19,  83, 179,
  ... 69 more items
]

Please double check any other impacts this may have before accepting this change as I have little familiarity with the code base, just went down a rabbit hole trying to understand this specific error.

MichalCz commented 3 years ago

Thanks, I'll be chasing the rabbit tomorrow. :)

On Fri, 14 May 2021, 01:26 Sterling, @.***> wrote:

It appears to me that the clusterFunc method passed to a delegate method is being too deeply nested.

Example code that should work:

const { DataStream } = require("./lib");

function* gen() { for (let z = 0; z < 1e3; z++) yield z; }

DataStream.fromIterator(gen()) .distribute((stream) => { return stream.filter((num) => { if (num < 2) return false; if (num == 2) return true; for (var i = 2; i < num / 2; i++) { if (num % i === 0) return false; } return true; }); }) .toArray() .then(console.log);

When running I get the following output:

[ { error: { message: 'The "id" argument must be of type string. Received null', stack: 'TypeError [ERR_INVALID_ARG_TYPE]: The "id" argument must be of type string. Received null\n' + ' at validateString (internal/validators.js:117:11)\n' + ' at Module.require (internal/modules/cjs/loader.js:1035:3)\n' + ' at require (internal/modules/cjs/helpers.js:77:18)\n' + ' at /Users/sterling/workplace/oss/scramjet/lib/stream-child.js:24:35\n' + ' at Array.map ()\n' + ' at /Users/sterling/workplace/oss/scramjet/lib/stream-child.js:17:55\n' + ' at processTicksAndRejections (internal/process/task_queues.js:97:5)' } }, ... 15 more items ]

I've traced this back to an issue between multi-stream.js wrapping the clusterFunc in an extra array and stream-worker.js failing to serialize it to a string.

This 1 line change appears to enable both passing in a function directly as above as well as maintaining the ability to pass in a string referencing a module to require. Output of above function with this patch:

[ 97, 193, 257, 353, 449, 577, 641, 673, 769, 929, 2, 3, 67, 131, 163, 227, 419, 547, 643, 739, 4, 5, 37, 101, 197, 229, 293, 389, 421, 613, 677, 709, 773, 997, 7, 71, 103, 167, 199, 263, 359, 487, 647, 743, 839, 967, 41, 73, 137, 233, 457, 521, 617, 809, 937, 11, 43, 107, 139, 331, 491, 523, 587, 619, 683, 811, 907, 971, 13, 109, 173, 269, 397, 461, 557, 653, 877, 941, 47, 79, 239, 271, 367, 431, 463, 719, 751, 911, 17, 113, 241, 337, 401, 433, 593, 881, 977, 19, 83, 179, ... 69 more items ]

Please double check any other impacts this may have before accepting this change as I have little familiarity with the code base, just went down a rabbit hole trying to understand this specific error.

You can view, comment on, or merge this pull request online at:

https://github.com/scramjetorg/scramjet/pull/107 Commit Summary

  • Fix array nesting in delegate

File Changes

Patch Links:

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/scramjetorg/scramjet/pull/107, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAJ3RBGNAINVT6XJ4MTC2VDTNRN2DANCNFSM443NQSDQ .

MichalCz commented 3 years ago

Hi, I'm checking this today and will release the fix.