terascope / teraslice

Scalable data processing pipelines in JavaScript
https://terascope.github.io/teraslice/
Apache License 2.0

overview of issues in terafoundation #3587

Open jsnoble opened 3 months ago

jsnoble commented 3 months ago

As we convert files over to ES modules, some things that work in CommonJS files are no longer possible. One in particular is that ES modules DO NOT allow a sync `require` of code; it has to be async. This changes a lot of our code in terafoundation and in job-components.

function requireConnector(filePath: string, errors: ErrorResult[]) {
    let mod = require(filePath);

    if (mod && mod.default) {
        mod = mod.default;
    }
    let valid = true;
    if (typeof mod !== 'object') {
        valid = false;
    }

    if (mod && typeof mod.config_schema !== 'function') {
        errors.push({
            filePath,
            message: `Connector ${filePath} missing required config_schema function`,
        });
        valid = false;
    }

    if (mod && typeof mod.create !== 'function') {
        errors.push({
            filePath,
            message: `Connector ${filePath} missing required create function`
        });
        valid = false;
    }

    if (valid) return mod;
    return null;
}

This code is one of the major underlying APIs for fetching connectors and the like.

In that code, the require statement needs to be turned into `let mod = await import(filePath);`. This now makes the function async.
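A minimal sketch of what the async version could look like, mirroring the checks in the function above. The name `requireConnectorAsync`, the `ConnectorModule` and `ModuleLoader` types, and the injectable `load` parameter are assumptions made for illustration (the last one only so the function can be exercised without real files); none of them are part of terafoundation.

```typescript
interface ErrorResult {
    filePath: string;
    message: string;
}

// Hypothetical shape, inferred from the checks in the original function.
interface ConnectorModule {
    config_schema: (...args: any[]) => unknown;
    create: (...args: any[]) => unknown;
}

// Hypothetical: an injectable loader, so callers (or tests) can swap it out.
type ModuleLoader = (filePath: string) => Promise<any>;

// Default loader: the native dynamic import, which always returns a promise.
const defaultLoader: ModuleLoader = (filePath) => import(filePath);

async function requireConnectorAsync(
    filePath: string,
    errors: ErrorResult[],
    load: ModuleLoader = defaultLoader
): Promise<ConnectorModule | null> {
    let mod = await load(filePath);

    // Unwrap a default export, as the sync version does.
    if (mod && mod.default) {
        mod = mod.default;
    }

    // Nothing usable was exported.
    if (mod == null || typeof mod !== 'object') return null;

    let valid = true;

    if (typeof mod.config_schema !== 'function') {
        errors.push({
            filePath,
            message: `Connector ${filePath} missing required config_schema function`,
        });
        valid = false;
    }

    if (typeof mod.create !== 'function') {
        errors.push({
            filePath,
            message: `Connector ${filePath} missing required create function`,
        });
        valid = false;
    }

    return valid ? mod : null;
}
```

Every caller now has to `await` the result, which is where the ripple effect described in this issue comes from.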

Since this low-level API has changed, everything built on top of it has to be async as well. The biggest one of all is the validateConfigs function, which is used to create the sysconfig.

This causes a problem in our code:

export class ProcessContext<
    S = Record<string, any>,
    A = Record<string, any>,
    D extends string = string
> extends CoreContext<S, A, D> {
    constructor(
        config: i.FoundationConfig<S, A, D>,
        overrideArgs?: i.ParsedArgs<S>
    ) {
        const cluster: i.Cluster = {
            isMaster: false,
            worker: {
                id: nanoid(8),
            }
        } as any;

        const parsedArgs = overrideArgs || getArgs<S>(
            config.default_config_file
        );

        const sysconfig = validateConfigs(cluster, config, parsedArgs.configfile);

        super(
            config,
            cluster,
            sysconfig,
        );
    }
}

validateConfigs has to be async now, which breaks this code: JavaScript does not really allow async actions in a constructor. We will have to change this to use an async initialize function or the like to properly start this up, which will change how teraslice and spaces use these classes.
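One possible shape for that change is a static async factory, sketched below under heavy simplification. `ProcessContextSketch`, `createContext`, `SysConfig`, and `validateConfigsAsync` are all hypothetical stand-ins for illustration; the real `ProcessContext` extends `CoreContext` and takes more arguments. The idea is to do the async validation first and keep the constructor synchronous, so it only receives values that are already resolved.

```typescript
// Hypothetical, simplified stand-in for the real sysconfig type.
interface SysConfig {
    [key: string]: unknown;
}

// Hypothetical stand-in for an async validateConfigs.
async function validateConfigsAsync(configFile: string): Promise<SysConfig> {
    return { configFile };
}

class ProcessContextSketch {
    readonly sysconfig: SysConfig;

    // The constructor stays synchronous: it only stores resolved values.
    private constructor(sysconfig: SysConfig) {
        this.sysconfig = sysconfig;
    }

    // All async startup work happens here, before construction.
    static async createContext(configFile: string): Promise<ProcessContextSketch> {
        const sysconfig = await validateConfigsAsync(configFile);
        return new ProcessContextSketch(sysconfig);
    }
}
```

Callers would then write `await ProcessContextSketch.createContext(configFile)` instead of `new ProcessContext(...)`, which is the kind of call-site change teraslice and spaces would need to absorb. A separate `initialize()` method called after `new` would also work, at the cost of the object existing briefly in an unvalidated state.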

godber commented 3 months ago

Reference from Jared's post: