threefoldtech / tfgrid-sdk-ts

Apache License 2.0
4 stars, 8 forks

`sdk-ts` : Support multiple stacks per network #3078

Open xmonader opened 2 months ago

0oM4R commented 2 months ago

Issue update: started implementing the logic, but I can't use the monitor package. [screenshot]

0oM4R commented 2 months ago

Also, the current monitor services log the aliveness of the services but do not return any values. I think this needs some refactoring, adding new methods.

0oM4R commented 2 months ago

WIP: creating the logic to pick the first available stack per service; this initial class is to be added to the alivenessCheck in monitoring:

export class serviceStackPicker {
  private result: ServicesUrls = {};
  constructor(public options: StackPickerOptions) {}

  // Ping a service once and release its connection, if it holds one.
  private async pingService(service: ILivenessChecker) {
    const status = await service.isAlive();
    if ("disconnect" in service) {
      await (service as IDisconnectHandler).disconnect();
    }
    return status;
  }

  // Resolve the first reachable URL for every service listed in the options.
  async GetAvailableServices(): Promise<ServicesUrls> {
    if ("tfChain" in this.options)
      this.result.tfChain = await this.getAvailableServiceStack(
        this.options.tfChain.slice(1),
        new TFChainMonitor(this.options.tfChain[0]),
      );
    // Check the same key that is read below ("GirdProxy"), not "gridProxy".
    if ("GirdProxy" in this.options)
      this.result.GirdProxy = await this.getAvailableServiceStack(
        this.options.GirdProxy.slice(1),
        new GridProxyMonitor(this.options.GirdProxy[0]),
      );

    return this.result;
  }

  // Try the service's current URL, then each fallback in order.
  // Returns undefined when none of them is alive.
  async getAvailableServiceStack(urls: string[], service: ILivenessChecker): Promise<string | undefined> {
    for (let index = 0; ; index++) {
      const status = await this.pingService(service);
      if (status.alive) return service.serviceUrl();
      console.log(`${service.serviceName()}: failed to ping ${service.serviceUrl()}, due to ${status.error}`);
      // Stop only after every fallback has been pinged, so the last URL is not skipped.
      if (index >= urls.length) return undefined;
      service.setServiceURl(urls[index]);
    }
  }
}

It can then be called from anywhere with:

const test = new serviceStackPicker({
  tfChain: ["faf", "wss://tfchain.dev.grid.tf/ws", "wsss://tfchain.dev.grid.tf/ws"],
  GirdProxy: ["hahah", "https://gridproxy.dev.grid.tf"],
});
console.log(await test.GetAvailableServices());

We also had to add a setServicesUrl method to IServiceBase and make some changes to update the affected props.

This snippet is from the initial phase of coding; feel free to suggest any refactors.

0oM4R commented 2 months ago

Issue update: the monitoring part is almost ready. I will support accessing the pick function directly and make #3105 ready for review.

0oM4R commented 2 months ago

Issue update: I was investigating how to read and parse an array from the user, but I couldn't get anywhere.

WIP: creating a script that reads the stacks for each service from the user, then converts them to a string and exports it. Current behavior: [screenshot]

AhmedHanafy725 commented 2 months ago

It shouldn't be interactive. You can export them comma-separated, then split them in the script.

0oM4R commented 2 months ago

> It shouldn't be interactive. You can export them comma-separated, then split them in the script.

Done. Exporting the target stacks separated by commas works fine. [screenshot]
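The non-interactive approach suggested above can be sketched as follows. This is only an illustration: the helper name `parseStackList` and the variable name `TFCHAIN_STACKS` are hypothetical, not names taken from the repository.

```typescript
// Hypothetical helper: turn "url1, url2,,url3" into ["url1", "url2", "url3"].
function parseStackList(raw: string | undefined): string[] {
  if (!raw) return [];
  return raw
    .split(",")
    .map(url => url.trim())
    .filter(url => url.length > 0);
}

// The value would normally come from an exported environment variable, e.g.
//   export TFCHAIN_STACKS="wss://a.example/ws,wss://b.example/ws"   (name assumed)
const stacks = parseStackList("wss://a.example/ws, wss://b.example/ws,");
console.log(stacks);
```

Trimming and dropping empty entries keeps a stray space or trailing comma in the exported value from producing an invalid URL in the stack.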

0oM4R commented 2 months ago

Issue update: https://github.com/threefoldtech/tfgrid-sdk-ts/pull/3105 got some changes:

0oM4R commented 2 months ago

Issue update:

#3105 had some refactors.

WIP: finding a way to integrate this logic with the playground.

0oM4R commented 1 month ago

Issue update: applied PR comments and introduced new features.

0oM4R commented 1 month ago

The new monitoring PR is ready: https://github.com/threefoldtech/tfgrid-sdk-ts/pull/3134

0oM4R commented 1 month ago

All review comments applied on https://github.com/threefoldtech/tfgrid-sdk-ts/pull/3134

0oM4R commented 1 month ago

Issue update: fixed some issues in monitoring while integrating it into the UI.

Blocker: https://github.com/threefoldtech/tfgrid-sdk-ts/blob/0a5568e40f31ebe8399e59ef31eedc6d75eefe02/packages/playground/src/clients/index.ts#L1-L10

Those clients get loaded before the envs are initialized, so they end up with undefined URLs. I suggest moving those clients to the grid store.
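Moving the clients into a store is one option. The underlying idea, deferring construction until the envs exist, can be sketched as below. This is not the playground's code: `env`, `ProxyClient`, and `getProxyClient` are hypothetical stand-ins used only for illustration.

```typescript
// Hypothetical stand-in for the environment object, filled in only after startup.
const env: { GRIDPROXY_URL?: string } = {};

// Hypothetical stand-in for a real client class.
class ProxyClient {
  constructor(public readonly url: string) {}
}

// Eager construction at import time would capture `undefined` here:
// const eagerClient = new ProxyClient(env.GRIDPROXY_URL as string);

let cached: ProxyClient | undefined;

// Lazy accessor: the URL is read on first use, after the envs are set.
function getProxyClient(): ProxyClient {
  if (!cached) {
    if (!env.GRIDPROXY_URL) throw new Error("GRIDPROXY_URL is not initialized yet");
    cached = new ProxyClient(env.GRIDPROXY_URL);
  }
  return cached;
}

// Simulate the app finishing its environment setup, then using the client.
env.GRIDPROXY_URL = "https://gridproxy.dev.grid.tf";
console.log(getProxyClient().url);
```

Throwing when the URL is still missing makes the ordering problem visible at the call site instead of silently building a client around an undefined URL.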

0oM4R commented 1 month ago

Issue update: monitoring is integrated in the playground.

Blocker: not sure yet whether we should provide a fake mnemonic for the RMB monitor, or do something else.

[screenshot]

0oM4R commented 1 month ago

All requested changes applied, and the UI is ready as well.

0oM4R commented 1 month ago

As we discussed with @sameh-farouk, we can use the chain's /health endpoint to verify the node's RPC status: if it responds with 200 OK, it is alive and we can rely on it.
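A minimal sketch of such a check, assuming the chain node is reachable over HTTP(S) at a known base URL. The names `HttpGet` and `isChainHealthy` are hypothetical, and the HTTP function is injected so the sketch does not depend on a particular runtime.

```typescript
// Minimal response shape this sketch relies on; the built-in fetch satisfies it.
type HttpGet = (url: string) => Promise<{ status: number }>;

// Returns true only when GET <baseUrl>/health answers with HTTP 200.
async function isChainHealthy(baseUrl: string, httpGet: HttpGet): Promise<boolean> {
  // Drop trailing slashes so we never request "//health".
  const healthUrl = `${baseUrl.replace(/\/+$/, "")}/health`;
  try {
    const response = await httpGet(healthUrl);
    return response.status === 200;
  } catch {
    // Network errors (DNS failure, refused connection, ...) count as not alive.
    return false;
  }
}
```

In Node 18+ or a browser, the global `fetch` could be passed in as `url => fetch(url)`. How a `wss://` stack URL maps to the HTTP base URL that serves `/health` is left open here.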

0oM4R commented 1 month ago

Issue update:

Blocker: while pinging all URLs in parallel by passing the URL to the alive method, all URLs give an error even if one of them is reachable. Work is on a testing branch.

0oM4R commented 1 month ago

Issue update: I added the required fallback mechanism, but I'm facing a very weird issue: when we have more than one invalid URL in the stack, the requested URLs for all services are affected and give a Timeout error. [screenshot]

If we remove one of the invalid/unreachable URLs from the array, it works fine: https://github.com/threefoldtech/tfgrid-sdk-ts/blob/889190d255a1b5a762ca47a6c880e10f30e824a7/packages/monitoring/example/serviceURLManager.ts#L10

details of the new applied mechanism:

Concerns: this approach takes a lot of time, because Promise.allSettled waits for all stacks to settle. If I have two URLs, where the first responds on the first try and the second is unreachable, I have to wait for the second one to exhaust all its retries. In our case, with a base timeout of 2 seconds, it takes about 12 seconds to return the first URL, which had already replied within the first 2 seconds. [screenshot]
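One possible way to reduce that wait, sketched under assumptions rather than taken from the PR: start every check at once, but await the results in priority order and return at the first alive URL. A slow lower-priority URL then no longer delays a fast higher-priority one. `AliveCheck` and `pickFirstAlive` are hypothetical names.

```typescript
// Hypothetical checker: resolves true if the URL is alive; false or a rejection otherwise.
type AliveCheck = (url: string) => Promise<boolean>;

async function pickFirstAlive(urls: string[], check: AliveCheck): Promise<string | undefined> {
  // Start all checks immediately so they run in parallel.
  // The catch maps a rejection to `false` and avoids unhandled rejections
  // for checks that are never awaited because an earlier URL won.
  const pending = urls.map(url => check(url).catch(() => false));

  // Await in the order the URLs were written; stop at the first alive one.
  for (let i = 0; i < urls.length; i++) {
    if (await pending[i]) return urls[i];
  }
  return undefined;
}
```

The trade-off is that a dead higher-priority URL still has to finish failing before a lower-priority one can be accepted, which is what preserves the written order.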

0oM4R commented 1 month ago

Issue update: resolved the fetch issue by providing valid or existing URLs.

Work completed: support for the URL monitor in gridclient. The code needs some cleanup and testing, then the PR will be ready.

0oM4R commented 1 month ago

I faced an issue while testing the playground integration: the changes to getDefaultUrls were breaking, as it is used in the playground and in tests, so I will create a new function that contains the newly added logic.

0oM4R commented 1 month ago

work completed:

A-Harby commented 4 days ago

Verified, Devnet.

The stack is working fine, checking all the URLs at once with 3 retries and picking the working one in the same order as they were written.
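The verified behaviour (all URLs checked at once, up to 3 attempts each, first working URL in written order wins) can be sketched roughly as below. This is an illustration, not the code from the PR; `AliveCheck`, `withRetries`, and `pickInOrder` are hypothetical names, and timeouts between attempts are left out.

```typescript
// Hypothetical checker: resolves true if the URL is alive; false or a rejection otherwise.
type AliveCheck = (url: string) => Promise<boolean>;

// Try a check up to `attempts` times; resolve true on the first success.
async function withRetries(url: string, check: AliveCheck, attempts = 3): Promise<boolean> {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      if (await check(url)) return true;
    } catch {
      // Treat a thrown error like a failed attempt and try again.
    }
  }
  return false;
}

// Check every URL in parallel, wait for all of them, then pick by position.
async function pickInOrder(urls: string[], check: AliveCheck): Promise<string | undefined> {
  const results = await Promise.allSettled(urls.map(url => withRetries(url, check)));
  const index = results.findIndex(r => r.status === "fulfilled" && r.value);
  return index === -1 ? undefined : urls[index];
}
```

Because the winner is chosen by index after all checks settle, the result does not depend on which URL happened to answer first.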

For one failing but the other working: [screenshot]

For all of them failing: [screenshot]

For the Grid client: [screenshot]

Created an issue for the Grid Client to work as config in playground: https://github.com/threefoldtech/tfgrid-sdk-ts/issues/3399.

New test Case: TC2825 - Run locally with multiple stack