breejs / bree

Bree is a Node.js and JavaScript job task scheduler with worker threads, cron, Date, and human syntax. Built for @ladjs, @forwardemail, @spamscanner, @cabinjs.
https://jobscheduler.net
MIT License
3.06k stars 79 forks source link

[question] workerData parameter shape #54

Open naz opened 3 years ago

naz commented 3 years ago

The problem I'm facing is a confusing (from job developer perspective) use of the data (workerData) passed into the worker. After looking closer into how workerData parameter is constructed, I have a proposal on how to structure it so it's more intuitive for the job developer.

Let's consider an example bree job initialization with a value passed in:

// bree client configuring a job with passed in data

const job = {
    cron:'0/5 * * * * *',
    name:'job-with-data',
    path:'/path/to/job-using-data.js',
    worker:{
      workerData: 42
    }
};

bree.add(job);
bree.start('job-with-data');
// job-using-data.js file content

const {workerData} = require('worker_threads');

(async () => {
    const data = workerData.job.worker.workerData;
    console.log(data); // prints '42'
})();

To get access to the data inside of the job file job developer has have knowledge about quite complicated bree's internal object structure workerData.job.worker.workerData. This structure doesn't match 1:1 to the parameter passed during the job initialization. As a developer writing a job, I would expect to have access to same data structure as if I initialized a job through native Worker constructor:

new Worker('/path/to/job-using-data.js', {workerData: 42});
// job-using-data.js file content

const {workerData} = require('worker_threads');

(async () => {
    const data = workerData;
    console.log(data); // prints '42'
})();

I don't see a good reason why bree job would need to have all the meta information about itself accessible from within the script. I think it's unnecessary for the job to know about it's name, interval or any other bree's internal configurations. It should be up to bree's client to use these configurations and the job itself should execute in self contained manner, with as little knowledge as possible. @niftylettuce what are your thoughts? Could you please clarify the usecase for passing in job metadata into workerData parameter?

Proposed change

I think initializing worker in following way would make reading workerData inside of jobs most intuitive:

const workerData = job.worker && job.worker.workerData ? job.worker.workerData : undefined;
const object = {
  ...(this.config.worker ? this.config.worker : {}),
  ...(job.worker ? job.worker : {}),
  workerData
};

this.workers[name] = new threads.Worker(job.path, object);
shadowgate15 commented 3 years ago

I'm pretty sure this is the way workerData is working now...am I incorrect?

nickperkinslondon commented 3 years ago

No, this is till an issue I think, I just found this issue, and it helped me add this to my job code:

{ Worker, isMainThread, workerData } = require('worker_threads');
my_data = workerData.job.worker.workerData;

...to avoid a bunch of strange data ( bree internal stuff? ) Now I just get the data I was expecting.