flojoy-ai / studio

Joyful visual programming for Python
https://docs.flojoy.ai
MIT License
197 stars 19 forks

[STU 192] - RFC - (Reactive) Blocks Developer Experience #982

Open TheBigSasha opened 9 months ago

TheBigSasha commented 9 months ago

Mirror of STU-192 on Linear

Question: How do we make it easy and powerful to write flojoy nodes?

Subquestion: What is the purpose of a flojoy node?

Subquestion: What control vs abstraction trade offs do we want to make?

Principles:

Note regarding pseudocode style: The pseudocode in this document resembles Java or TypeScript. It mixes syntax, picking language features from both as convenient for brevity and specificity; the real Python code will probably look quite different. The use of generics isn't completely valid, but it should get the point across about which types go where. Feel free to reach out to me (@TheBigSasha) if you have any questions.

Design Proposal

FJBlock<In, Out> make_lambda_block(run: (input: In, hooks: Function[]) => Out) {
  // we pass in class method "memoize" as a hook to the lambda because it lets us have some more built in functions (hooks) to manage the block without class access
  return new FJBlock<In, Out>() {
    //...the rest of the implementation of abstract methods
    on_next(input) {
       this.publish(run(input, [this.memoize]));
    }
  }
}

class NAxisBlock extends FJBlock<{controllerID: int}, NAxisState> {
  controller = null;

  void on_next(input) {
    memoize(() => {
      // reconnect only when the controller ID changes
      if (this.controller != null) {
        this.controller.destruct();
      }
      this.controller = getGameController(input.controllerID);

      this.controller.onInput((state: NAxisState) => {
        this.publish(state);
      });
    }, [input.controllerID]);
  }
}

- Legacy blocks should be easy to migrate. If we can't use lambda blocks, can we come up with a class block that fully encompasses the function of retro-style blocks, i.e. generate a class block given the function and Python docstring for a classic flojoy block?

```typescript
FJBlock<In, Out> import_legacy_block(block: Function, docstring: Docstring) {
    // Write a function to turn a legacy block into an FJBlock instance. Use the docstring for the parameters, title, etc. And use the on_next for running.
}
```
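A rough Python sketch of how such a wrapper might look. Everything here is an assumption for illustration: the `FJBlock` stand-in, the `published` list that replaces a real Rx subject, and the choice to read the docstring from the function itself rather than take it as a separate argument.

```python
import inspect
from typing import Any, Callable


class FJBlock:
    """Hypothetical stand-in for the RFC's FJBlock base class."""

    def __init__(self) -> None:
        # Records outputs in a list instead of pushing to a real Rx subject.
        self.published = []

    def publish(self, value: Any) -> None:
        self.published.append(value)

    def on_next(self, **inputs: Any) -> None:
        raise NotImplementedError


def import_legacy_block(block: Callable[..., Any]) -> FJBlock:
    """Wrap a plain function block in an FJBlock instance."""
    signature = inspect.signature(block)

    class LegacyBlock(FJBlock):
        # Title and description are taken from the function and its docstring.
        title = block.__name__
        description = inspect.getdoc(block) or ""
        # Parameter names and annotations come from the function signature.
        parameters = {
            name: param.annotation for name, param in signature.parameters.items()
        }

        def on_next(self, **inputs: Any) -> None:
            # Run the legacy function and publish whatever it returns.
            self.publish(block(**inputs))

    return LegacyBlock()


def ADD(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b


add_block = import_legacy_block(ADD)
add_block.on_next(a=2, b=3)
```

In this sketch the legacy function is left untouched, so existing blocks would not need to be rewritten to be wrapped.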

Function Blocks, Hooks

A block can contain hooks for effects/memoization, and internal state which abstract away the RX subjects.

Publish

The publish or "state" hook works just like ReactJS useState — upon calls to publish (setState), changes are propagated. Otherwise, changes to state do not propagate in the reactive compute graph. This can be used for hardware support, like so:

const oddsOnly = (next, hooks) => {
  if (next % 2 != 0) {
    hooks.publish(next);
  }
  return;
}

Instead of returning nulls, this function block filters out evens and propagates only odds by using the publish hook.
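The same filter could look roughly like this in Python. This is only a sketch: the `LambdaBlock` class, the `hooks` namespace, and the `outputs` list (standing in for the downstream graph) are hypothetical names, not an existing Flojoy API.

```python
from types import SimpleNamespace
from typing import Any, Callable, List


class LambdaBlock:
    """Hypothetical minimal block: runs a function and exposes a publish hook."""

    def __init__(self, run: Callable[[Any, SimpleNamespace], None]) -> None:
        self.run = run
        # Values that were published; a real block would push to a subject.
        self.outputs: List[Any] = []
        self.hooks = SimpleNamespace(publish=self.outputs.append)

    def on_next(self, value: Any) -> None:
        # The return value is ignored; only explicit publish calls propagate.
        self.run(value, self.hooks)


def odds_only(value: int, hooks: SimpleNamespace) -> None:
    if value % 2 != 0:
        hooks.publish(value)


block = LambdaBlock(odds_only)
for n in range(6):
    block.on_next(n)
```

Because nothing propagates unless `publish` is called, even inputs simply produce no downstream update.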

Memoize / Effect

The memoize hook is particularly useful for ML or other long compute tasks. It takes a lambda and a list of parameter keys; it runs the lambda once, and afterwards only when one of the parameter keys changes. Think useMemo and useEffect in React terms. There is no distinction between the two here, since React distinguishes them based on time of run, primarily for DOM access, which is not an issue for us.
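A minimal sketch of the memoize behaviour described here, assuming one memo slot per block. The `Memoizer` class and `load_model` function are hypothetical; a real implementation would probably key slots by call site the way React hooks do.

```python
from typing import Any, Callable, Optional, Sequence, Tuple


class Memoizer:
    """Hypothetical single-slot memoize hook."""

    def __init__(self) -> None:
        self._deps: Optional[Tuple[Any, ...]] = None
        self._value: Any = None

    def memoize(self, compute: Callable[[], Any], deps: Sequence[Any]) -> Any:
        deps = tuple(deps)
        # Run on the first call, then only when a dependency value changes.
        if self._deps is None or deps != self._deps:
            self._value = compute()
            self._deps = deps
        return self._value


calls = []


def load_model(name: str) -> str:
    # Stands in for an expensive step such as loading model weights.
    calls.append(name)
    return f"model:{name}"


memo = Memoizer()
for name in ["resnet", "resnet", "vgg"]:
    result = memo.memoize(lambda: load_model(name), [name])
```

The second `"resnet"` input reuses the cached value, so `load_model` runs twice for three inputs.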

TheBigSasha commented 9 months ago

Suggestion: Add a destruct method to blocks that runs before flowchart teardown, called when the stop signal is received.
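One way this could be wired up, as a sketch only. `StopSignal`, `Block`, `ControllerBlock`, and `run_flowchart` are hypothetical names; the point is that teardown runs in a `finally` so blocks are destructed whether the flowchart finishes or is stopped.

```python
class StopSignal(Exception):
    """Hypothetical signal raised when the user stops the flowchart."""


class Block:
    def destruct(self) -> None:
        """Release resources held by the block; a no-op by default."""


class ControllerBlock(Block):
    def __init__(self) -> None:
        self.connected = True  # stands in for an open hardware handle

    def destruct(self) -> None:
        self.connected = False


def run_flowchart(blocks, work) -> None:
    try:
        work()
    except StopSignal:
        pass  # a stop request is a normal way to end the run
    finally:
        # Teardown always runs, giving each block a chance to clean up.
        for block in blocks:
            block.destruct()


def stop_immediately() -> None:
    raise StopSignal()


controller = ControllerBlock()
run_flowchart([controller], stop_immediately)
```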

39bytes commented 9 months ago

Thanks Sasha for this writeup, it's a nice roadmap for things to implement in the next few weeks. I just had a few questions while reading.

  /**
    **NORTH STAR FEATURE**
    return the input names and types of the class, by default use reflection to return the In type, but when In is an arbitrary object, allow fully custom outputs.
    This is useful for highly generic nodes like ROS pub sub. Beats using a JSON input / output
  */
  public default Pair<String, String>[] getFlowchartInputs();

I'm not exactly sure what you mean by "fully custom outputs" when In is some arbitrary object, if this is a generic wouldn't it just be a type variable? Python has support for generics via TypeVar so we could just use reflection all the time: (This is python 3.12 syntax, but you can still use generics in 3.11 and earlier)

def foo[T](a: T, b: T) -> T:
    pass

  public default Subject ___get_subject___();
  /**
     a safe user-oriented abstraction of the observable to publish data to the flowchart
  */
  protected default void publish(Out out);

Here, you expose the reactiveX parts in the class. This class mainly looks like an extension of the FCBlockIO class we created during the initial scaffolding of studiolab. In this case, since you want to be able to control exactly when the publish occurs, wouldn't we have to change out to be a subject instead of an observable to control when to push updates?

With the lambda block factory, it makes sense as an easy way to wrap existing blocks, but one thing that was confusing me is that the pseudocode instantiates an FJBlock which is abstract:

FJBlock<In, Out> make_lambda_block(run: (input: In, hooks: Function[]) => Out) {
  // we pass in class method "memoize" as a hook to the lambda because it lets us have some more built in functions (hooks) to manage the block without class access
  return new FJBlock<In, Out>() {
    //...the rest of the implementation of abstract methods
    on_next(input) {
       this.publish(run(input, [this.memoize]));
    }
  }
}

I think in the actual implementation we would just have a LambdaBlock class that takes a block function in its constructor and fills in some defaults.

class LambdaBlock(FJBlock):
    def __init__(self, block_func):
        self.func = block_func
        ...

    def on_next(self, *args, **kwargs):
        return self.func(*args, **kwargs)

    # other default implementations

Also, since you're injecting the hooks through the function parameters, this means when migrating all of the existing function blocks to this format, we would have to add an optional hooks parameter for all of them. Personally, I think it would be fine to restrict the use of effects/memoize/state to class-based blocks only. I think the majority of blocks don't need hooks, and those that do can easily be converted into classes inheriting from FJBlock.
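For illustration, one possible way to avoid touching existing signatures is to inspect the function and pass hooks only when it asks for them. This is a sketch under that assumption, not something proposed in the RFC; `call_block` and the `hooks` namespace are hypothetical.

```python
import inspect
from types import SimpleNamespace


def call_block(func, value, hooks):
    """Pass hooks only to functions that declare a `hooks` parameter."""
    if "hooks" in inspect.signature(func).parameters:
        return func(value, hooks=hooks)
    return func(value)


def legacy_double(x):
    # An existing function block with no knowledge of hooks.
    return 2 * x


def odds_only(x, hooks):
    # A new-style function block that opts in to hooks.
    if x % 2 != 0:
        hooks.publish(x)


published = []
hooks = SimpleNamespace(publish=published.append)

doubled = call_block(legacy_double, 21, hooks)
call_block(odds_only, 3, hooks)
call_block(odds_only, 4, hooks)
```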

TheBigSasha commented 9 months ago

>   /**
>     **NORTH STAR FEATURE**
>     return the input names and types of the class, by default use reflection to return the In type, but when In is an arbitrary object, allow fully custom outputs.
>     This is useful for highly generic nodes like ROS pub sub. Beats using a JSON input / output
>   */
>   public default Pair<String, String>[] getFlowchartInputs();
>
> I'm not exactly sure what you mean by "fully custom outputs" when In is some arbitrary object, if this is a generic wouldn't it just be a type variable? Python has support for generics via TypeVar so we could just use reflection all the time: (This is python 3.12 syntax, but you can still use generics in 3.11 and earlier)

For this, I was thinking about fully runtime defined I/O. For example, a web API node whose output is an object with the fields of what's queried from the API, or a hardware node which outputs differently based on what's connected. I put it as a "north star" feature because it may be difficult to implement or narrowly useful.
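A sketch of how the two cases could coexist in Python: reflection by default, with an override for blocks whose I/O is only known at runtime. The `FJBlock` stand-in, `get_flowchart_inputs`, `ScaleBlock`, and `RuntimeBlock` are all hypothetical names used for illustration.

```python
from typing import Any, Dict, List, Tuple, get_type_hints


class FJBlock:
    """Hypothetical stand-in for the RFC's FJBlock base class."""

    def on_next(self, **inputs: Any) -> None:
        raise NotImplementedError

    def get_flowchart_inputs(self) -> List[Tuple[str, str]]:
        # Default: reflect over the annotations of the subclass's on_next.
        hints = get_type_hints(type(self).on_next)
        return [
            (name, getattr(tp, "__name__", str(tp)))
            for name, tp in hints.items()
            if name != "return"
        ]


class ScaleBlock(FJBlock):
    def on_next(self, signal: float, gain: int) -> None:
        pass


class RuntimeBlock(FJBlock):
    """Inputs are only known at runtime, e.g. from a message schema."""

    def __init__(self, schema: Dict[str, str]) -> None:
        self.schema = schema

    def get_flowchart_inputs(self) -> List[Tuple[str, str]]:
        # Override: report whatever the runtime schema says.
        return list(self.schema.items())


static_inputs = ScaleBlock().get_flowchart_inputs()
runtime_inputs = RuntimeBlock({"topic": "str", "rate_hz": "float"}).get_flowchart_inputs()
```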

>   public default Subject ___get_subject___();
>   /**
>      a safe user-oriented abstraction of the observable to publish data to the flowchart
>   */
>   protected default void publish(Out out);
>
> Here, you expose the reactiveX parts in the class. This class mainly looks like an extension of the FCBlockIO class we created during the initial scaffolding of studiolab. In this case, since you want to be able to control exactly when the publish occurs, wouldn't we have to change out to be a subject instead of an observable to control when to push updates?

Yes, we may have to make changes to out to enable this API. I think the internal reactive structure of the node will need a rework, but that implementation detail may be something we can iron out after establishing the design from a developer interface perspective.

> With the lambda block factory, it makes sense as an easy way to wrap existing blocks, but one thing that was confusing me is that the pseudocode instantiates an FJBlock which is abstract:

This is an anonymous class, some relatively arcane Java syntax. My bad for using something that isn't all too common. It's a shorthand for directly instantiating an abstract class by filling in any abstract methods on instantiation. In effect it is the Java equivalent of a method taking a function as input.

> FJBlock<In, Out> make_lambda_block(run: (input: In, hooks: Function[]) => Out) {
>   // we pass in class method "memoize" as a hook to the lambda because it lets us have some more built in functions (hooks) to manage the block without class access
>   return new FJBlock<In, Out>() {
>     //...the rest of the implementation of abstract methods
>     on_next(input) {
>        this.publish(run(input, [this.memoize]));
>     }
>   }
> }
>
> I think in the actual implementation we would just have a LambdaBlock class that takes a block function in its constructor and fills in some defaults.

If that's a better way to handle this idea in Python, I think that's fair. There isn't a technical reason it should be a factory instead of a constructor.

> class LambdaBlock(FJBlock):
>     def __init__(self, block_func):
>         self.func = block_func
>         ...
>
>     def on_next(self, *args, **kwargs):
>         return self.func(*args, **kwargs)
>
>     # other default implementations
>
> Also, since you're injecting the hooks through the function parameters, this means when migrating all of the existing function blocks to this format, we would have to add an optional hooks parameter for all of them. Personally, I think it would be fine to restrict the use of effects/memoize/state to class-based blocks only. I think the majority of blocks don't need hooks, and those that do can easily be converted into classes inheriting from FJBlock.

I am imagining that the function blocks and hooks are going to be used primarily for short script or transformation nodes written in an in-studio editor. I wouldn't want someone jumping into Flojoy to have to learn the shape of the FJBlock class unless they need to, which they don't if they're making a simple custom block that does some math or something. Do you think there's a better way to approach that use case?

My goal with the function block was dual:

These two goals can also be split off into 2 different implementations if that is deemed to be optimal.

IsabelParedes commented 9 months ago

Hi! From the ROS side of things, this new proposal will solve:

Could you elaborate on the plan for handling blocks with special dependencies? Would this be the same as it was previously?

@flojoy(deps={"torch": "2.0.1", "torchvision": "0.15.2"})
def DEEPLAB_V3(default: Image) -> Image:
    ...

Would all dependencies be handled with poetry?

TheBigSasha commented 9 months ago

> Hi! From the ROS side of things, this new proposal will solve:
>
> Could you elaborate on the plan for handling blocks with special dependencies? Would this be the same as it was previously?
>
> @flojoy(deps={"torch": "2.0.1", "torchvision": "0.15.2"})
> def DEEPLAB_V3(default: Image) -> Image:
>     ...
>
> Would all dependencies be handled with poetry?

I don't see any reason we can't keep using the old dependency management system; I would say it will stay completely the same. Maybe the annotation would go above the class instead of the function. @itsjoeoui Thoughts?
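As a sketch of why the same annotation could sit on either a function or a class: a decorator that only attaches metadata works on both. `with_deps` and the `__block_deps__` attribute are hypothetical and are not the actual `@flojoy` decorator.

```python
from typing import Dict


def with_deps(deps: Dict[str, str]):
    """Hypothetical decorator that records pinned dependencies on its target."""

    def decorate(target):
        # Works for functions and classes alike: both accept new attributes.
        target.__block_deps__ = dict(deps)
        return target

    return decorate


@with_deps({"torch": "2.0.1", "torchvision": "0.15.2"})
def deeplab_function_block(image):
    return image


@with_deps({"torch": "2.0.1", "torchvision": "0.15.2"})
class DeeplabClassBlock:
    def on_next(self, image):
        return image
```

An installer could then read `__block_deps__` from either kind of block before running it.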

jjerphan commented 9 months ago

Hi @TheBigSasha, thank you for writing this comprehensive RFC.

It seems the "Publish" subsection misses a sentence at its end.

IIRC, poetry was used because conda-lock has some slowness or because conda-forge does not ship all the packages (e.g. the headless version of opencv, which is needed since OpenCV's license is tied to Qt's).

Let's wait for @itsjoeoui's answer?

Apart from that, import_legacy_block looks like a good approach a priori to port all the nodes' code to the new blocks.

TheBigSasha commented 9 months ago

> Hi @TheBigSasha, thank you for writing this comprehensive RFC.
>
> It seems the "Publish" subsection misses a sentence at its end.
>
> IIRC, poetry was used because conda-lock has some slowness or because conda-forge does not ship all the packages (e.g. the headless version of opencv, which is needed since OpenCV's license is tied to Qt's).
>
> Let's wait for @itsjoeoui's answer?
>
> Apart from that, import_legacy_block looks like a good approach a priori to port all the nodes' code to the new blocks.

Thanks for the note, I updated the publish section with an example :)