Jaseci-Labs / jaseci

The Official Jaseci Code Repository
https://jaseci.org
155 stars 208 forks source link

[Bug] [Jac Cloud] Private & Protected Walker Attributes are queryable in endpoint #1338

Closed ChrisIsKing closed 4 days ago

ChrisIsKing commented 2 weeks ago

Describe the bug

Private (:priv) and protected (:protect) walker attributes are queryable in Jac Cloud endpoint responses, which violates the expected encapsulation of such attributes. The expected behavior is that only public attributes (:pub) should be visible, while private and protected attributes should remain inaccessible to external queries.

To Reproduce

  1. Create a walker with private, protected, and public attributes as shown below.
  2. Use an endpoint to query the walker's state after execution.
  3. Observe that both private (var_a) and protected (var_b) attributes are included in the endpoint request body and the endpoint response, even though they should not be.
walker test_walker {
    has :priv var_a: `Any = "";
    has :protect var_b: `Any = "";
    has :pub var_c: `Any = "";

    obj __specs__ {
        static has auth: bool = False;
    }

    can update_values with `root entry {
        self.var_a = self.var_a + "Updated value";
        self.var_b = self.var_b + "Updated value";
        self.var_c = self.var_c + "Updated value";
    }

    can report_value with `root exit {
        report self;
    }
}
Screenshot 2024-10-04 at 12 18 46 AM Screenshot 2024-10-04 at 12 19 01 AM
amadolid commented 1 week ago

private, protected, public approach might be for different context like inheritance and encapsulation.

We can use additional option on __specs__, however, It might cause some restriction.

I think the right way to do it is by using python's behavior (ClassVar). Although, it's declared as Class' variable, we usually do it for type hinting that the current class has this variable. It will also work with inheritance and they may set it as optional variable.

from dataclasses import dataclass, field
from typing import ClassVar

@dataclass
class WithPrivate1:
    immutable_var: int = 1
    mutable_var: dict = field(default_factory=dict)

    var_for_type_hint: ClassVar[int]

# same with
class WithPrivate2:
    var_for_type_hint: int

    def __init__(self, immutable_var: int = 1, mutable_var: dict = None) -> None:
        self.immutable_var = immutable_var
        self.mutable_var = mutable_var or {}

using has static val: int will be converted to val: ClassVar[int]

walker sample{
    has static val: int;
}
ChrisIsKing commented 1 week ago

Yes, I agree that private, protected and public are for preserving inheritance and encapsulation. So how do you cater for the case where you want the variable to be instance specific but also not exposed on the api. Since static vars are shared among all instances of the class, future instances of the walker may run into issues when accessing these vars no?.

amadolid commented 1 week ago

Python handles ClassVar (equivalent to has static var) differently .

Since Python doesn't statically define variables in a class similar to other OOP languages, ClassVar can be used in different purposes.

For example:

class Sample:
    val1: int = 1
    val2: int
    # similar to 
    # val1: ClassVar[int] = 1
    # val2: ClassVar[int]

val1 will be a class variable that's accessible in instance and child classes/instances. this can be access via self.val1 or Sample.val1

val2 is different. Since we don't specify it's value, it will not be available in instance and child classes/instances. However, type hinting will still be available. It's like Sample classs/instance "should" have val2 with int type. This will require to be assigned first before it can be accessed.

val1 and val2 might be different but both can still be utilize as instance variable. assigning variable to both using instance variable (ex: self) will only assign it to the instance. It will not update class variable value.

s = Sample()
s.val1 = 10

print(s.val1) # prints 10
print(Sample.val1) # prints 1

print(s.val2) # will error
s.val2 = 1
print(s.val2) # prints 1

In this context, I think using has static var is the right way to have a private like field for walker API

amadolid commented 1 week ago

adding option to hide fields using __specs__ might add some confusion/restriction. Since that field will no longer be available in API request, we will require dev to set a default value/factory. We will also require them to override __serialize__ and __deserialize__ if they wish to don't include it on db or response

However, I will add it just to support this edge case: Hidden in API, included in db and don't want to override __serialize__ and __deserialize__

ypkang commented 5 days ago
from __future__ import annotations
import typing as _jac_typ
from jaclang.plugin.feature import JacFeature as _Jac
from jaclang.plugin.builtin import *
from dataclasses import dataclass as __jac_dataclass__

@_Jac.make_walker(on_entry=[_Jac.DSFunc('update_values', _Jac.RootType)], on_exit=[_Jac.DSFunc('report_value', _Jac.RootType)])
@__jac_dataclass__(eq=False)
class test_walker(_Jac.Walker):
    var_a: _jac_typ.Any = _Jac.has_instance_default(gen_func=lambda: '')
    var_b: _jac_typ.Any = _Jac.has_instance_default(gen_func=lambda: '')
    var_c: _jac_typ.Any = _Jac.has_instance_default(gen_func=lambda: '')

    @_Jac.make_obj(on_entry=[], on_exit=[])
    @__jac_dataclass__(eq=False)
    class __specs__(_Jac.Obj):
        auth: _jac_typ.ClassVar[bool] = False

    def update_values(self, _jac_here_: _Jac.RootType) -> None:
        self.var_a = self.var_a + 'Updated value'
        self.var_b = self.var_b + 'Updated value'
        self.var_c = self.var_c + 'Updated value'

    def report_value(self, _jac_here_: _Jac.RootType) -> None:
        _Jac.report(self)
ChrisIsKing commented 5 days ago

@marsninja We noticed that the bytecode generated for vars declared as private, protected and public we no different. Is this the intended behavior?

ChrisIsKing commented 5 days ago

Support for excluding vars from endpoint was added in #1364. However behavior of private, protected and public has not yet been resolved.

amadolid commented 4 days ago

Hi @ChrisIsKing , @ypkang , @marsninja ,

Here's some insight in a BE perspective.

I don't think private, protected and public should be used in this context. The reason for it is, it's like we're implementing OOP but not properly. Those access modifier is supposed to be for inheritance and encapsulation. Let's assume python support that (even tho it's not). This feature should not affect how the object will be represent during communication. For example, node to db or node to response.

Here's what I mean by that. On a normal entity class in java, we always used private fields. This is not to hide it from response but purely for encapsulations. Getters are usually used for serialization if no other serialization is specified.

class Person {
    private String name;
    private int age;

    // getters and setters

    @JsonGetter("full_name") // optional for naming
    public String getName() {
        return name;
    }

    @JsonSetter("full_name") // optional for naming
    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }
}

When we are communicating to db, even tho fields are private, since it was serialized, we are still able to save all fields.

In regards to a response, let say we want to hide age, here is where a Data Transfer Object comes in handy. We usually convert entity models to a DTO to have a clearer and safer response. In this approach, we can also control the return type and the actual return.

We also use DTO as request structure for proper validation.


In Jaseci perspective, we can still follow the rule without using private, protected, and public

obj CustomResponse {
     has id: str;
}

node A {
    has val: int;
}

walker Testing {
    can enter with `root entry {
        self.a = here ++> A(val=1);
    }

    on disengaged -> CustomResponse  { # or with exit -> CustomResponse
        return CustomResponse(jid(self.a));
    }
}

In this approach, we will be lesser to override serialization because every object are used and utilized in their respective calls.

Also walker's can act as the actual Request DTO.

ChrisIsKing commented 4 days ago

@amadolid Yh I understand what you're saying. Couple things I believe came out of this:

amadolid commented 4 days ago

@amadolid Yh I understand what you're saying. Couple things I believe came out of this:

* `private`, `protected` and `public` are treated no differently in jaclang currently. This needs to be fixed.

* We need a more readable way to delineate what should be exposed as params on the endpoint. The excluded var in `__specs__` is one way of doing this but it would be nice to do this at the language level as well.

Yeah, actually that's also the reason why I'm suggesting ClassVar approach. You can also not specify it and just use self.variable_name = value.

In that sense, walker will be the actual DTO and the additional self.variable_name will be the runtime variables.

amadolid commented 4 days ago
  • We need a more readable way to delineate what should be exposed as params on the endpoint. The excluded var in __specs__ is one way of doing this but it would be nice to do this at the language level as well.

In regards to this, specifying the expected parameter for the endpoint should be the language level approach right ? if you want to hide it from api, why declare it in the first place? I mean you also assign the value at runtime too

The exclude on specs is just for edge case and it's not recommended.

Here's the sample implementation

walker test_walker {
    static has var_a: `Any, var_b: `Any, var_c: `Any;

    obj __specs__ {
        static has auth: bool = False;
    }

    can update_values with `root entry {
        self.var_a = "Updated value";
        self.var_b = "Updated value";
        self.var_c = "Updated value";
    }

    can report_value with `root exit {
        report self;

        report self.__dict__;
    }
}

image

ChrisIsKing commented 4 days ago

Closing issue. This can be supported by the use of:

  1. has static var;
  2. has var: ClassVar[type]

We will discuss the handling of private, protected and public in a separate issue.