cloud-custodian / cel-python

Pure Python implementation of the Common Expression Language
Apache License 2.0

Performance of evaluation #68

Open NoamSherr opened 1 week ago

NoamSherr commented 1 week ago

Hey there,

I ran some performance benchmarks of CEL expression evaluation with this library and wanted to check whether the results are expected. I looked at https://github.com/cloud-custodian/cel-python/wiki/Early-Profiling-Data and https://github.com/cloud-custodian/cel-python/blob/main/benches/large_resource_set.py and wanted to validate evaluation time for large data sets.

The expression I tried running is the following - ((!has(class_a.property_a) ? false : ("Linux" == class_a.property_a)) && ((!has(class_b.property_b) ? false : class_b.property_b.contains("os:/o:centos:centos:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/o:centos:centos:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("os:/a:centos:centos:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/a:centos:centos:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("p-os:/a:centos:centos:")))) ? "Linux Team" : (((!has(class_a.property_a) ? false : ("Linux" == class_a.property_a)) && ((!has(class_b.property_b) ? false : class_b.property_b.contains("os:/o:debian:debian_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/o:debian:debian_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("os:/a:debian:debian:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/a:debian:debian:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("p-os:/a:debian:debian_linux:")))) ? "Linux Team" : (((!has(class_a.property_a) ? false : ("Linux" == class_a.property_a)) && ((!has(class_b.property_b) ? false : class_b.property_b.contains("os:/o:fedoraproject:fedora:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/o:fedoraproject:fedora:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("os:/a:fedoraproject:fedora:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/a:fedoraproject:fedora:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("p-os:/a:fedoraproject:fedora:")))) ? "Linux Team" : (((!has(class_a.property_a) ? false : ("Linux" == class_a.property_a)) && ((!has(class_b.property_b) ? false : class_b.property_b.contains("os:/o:oracle:linux:")) || (!has(class_b.property_b) ? 
false : class_b.property_b.contains("x-os:/o:oracle:linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("os:/a:oracle:linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/a:oracle:linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("p-os:/a:oracle:linux:")))) ? "Linux Team" : (((!has(class_a.property_a) ? false : ("Linux" == class_a.property_a)) && ((!has(class_b.property_b) ? false : class_b.property_b.contains("os:/o:redhat:enterprise_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/o:redhat:enterprise_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("os:/a:redhat:enterprise_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/a:redhat:enterprise_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("p-os:/a:redhat:enterprise_linux:")))) ? "Linux Team" : (((!has(class_a.property_a) ? false : ("Linux" == class_a.property_a)) && ((!has(class_b.property_b) ? false : class_b.property_b.contains("os:/o:novell:suse_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/o:novell:suse_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("os:/a:novell:suse_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/a:novell:suse_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("p-os:/a:novell:suse_linux:")))) ? "Linux Team" : (((!has(class_a.property_a) ? false : ("Linux" == class_a.property_a)) && ((!has(class_b.property_b) ? false : class_b.property_b.contains("os:/o:canonical:ubuntu_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/o:canonical:ubuntu_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("os:/a:canonical:ubuntu_linux:")) || (!has(class_b.property_b) ? 
false : class_b.property_b.contains("x-os:/a:canonical:ubuntu_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("p-os:/a:canonical:ubuntu_linux:")))) ? "Linux Team" : (((!has(class_a.property_a) ? false : ("Linux" == class_a.property_a)) && ((!has(class_b.property_b) ? false : class_b.property_b.contains("os:/h:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/h:")))) ? "Linux Team" : (((!has(class_a.property_a) ? false : ("Windows Server" == class_a.property_a)) && ((!has(class_b.property_b) ? false : class_b.property_b.contains("os:/o:microsoft:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/o:microsoft:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("os:/a:microsoft:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/a:microsoft:")))) ? "Windows Team" : (((!has(class_a.property_a) ? false : ("Windows Server" == class_a.property_a)) && ((!has(class_b.property_b) ? false : class_b.property_b.contains("os:/h:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/h:")))) ? "Windows Team" : (((!has(class_a.property_a) ? false : ("Linux" == class_a.property_a)) || (!has(class_a.property_a) ? false : ("Windows Server" == class_a.property_a))) ? "Unassigned" : (((!has(class_a.property_a) ? false : ("Windows Workstation" == class_a.property_a)) && ((!has(class_b.property_b) ? false : class_b.property_b.contains("os:/o:microsoft:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/o:microsoft:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("os:/a:microsoft:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/a:microsoft:")))) ? "Solutions Team" : ((!has(class_a.property_a) ? false : ("Windows Workstation" == class_a.property_a)) ? "End User" : "unknown")))))))))))) with the following context -

cel_context = {
                        "class_a": celpy.json_to_cel({"property_a":"something"}),
                        "class_b": celpy.json_to_cel({"title":"something else","property_b":"some var","integration_info":{"type":"GitHub"}}),
                        "optional": celpy.json_to_cel({})
                    }

The average I'm getting on my local machine (MacBook Pro M1 Max, 32 GB RAM) is around 3.5 seconds per iteration. Is this expected? For comparison, https://github.com/google/cel-java takes 1 second for the same 10k iterations, which is orders of magnitude faster. Is it possible to reach the Java times? Perhaps Lark is causing an issue in the evaluation? I also tried passing the CompiledRunner class to the env to boost performance, but it seems to call super() for the evaluation, which raises NotImplementedError, so I'm not sure how to evaluate the expressions as pure Python.

I tried smaller expressions; this one takes 0.2 seconds per iteration -

(!has(class_b.integration_info.type) ?
                false:("some value" == class_b.integration_info.type)) ?
                optional.of("some value") : optional.of("some other value")

Which is still significantly slower than cel-java.

(need to run it with these custom functions -

functions = {
    "of": lambda optional, value: value,
    "none": lambda optional, : None
}

)

I know there is redundancy in the big expression and that it can be optimized, but I'm using it intentionally since it is user-generated and can be encountered in my system.

I'm attaching the Python script I used for the benchmarking. I tried wrapping the processing code with cProfile, but that slowed things down even more, so the timings are based on the time module.
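For reference, a minimal stdlib-only timing harness that could replace hand-placed time.time() calls. This is only a sketch: `bench` and `sample_workload` are hypothetical names, and the workload is a stand-in for `program.evaluate(cel_context)`. It uses `time.perf_counter()`, which is intended for measuring short intervals, and reports per-call statistics without printing inside the timed loop.

```python
import statistics
import time


def bench(fn, iterations=1000, warmup=10):
    """Call fn() repeatedly and return per-call timing statistics in milliseconds."""
    for _ in range(warmup):              # untimed warm-up calls
        fn()
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()      # monotonic, high-resolution clock
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "iterations": iterations,
        "mean_ms": statistics.mean(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }


def sample_workload():
    # Hypothetical stand-in for program.evaluate(cel_context).
    return sum(i * i for i in range(1000))


if __name__ == "__main__":
    stats = bench(sample_workload, iterations=200)
    print(f"mean {stats['mean_ms']:.3f} ms, median {stats['median_ms']:.3f} ms")
```

Keeping the context construction (`celpy.json_to_cel`) outside `fn` would measure evaluation alone; putting it inside would measure the full per-row cost.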

Any help on this would be much appreciated,

Thanks,

import celpy
import celpy.celtypes
import time

CEL_EXPRESSION_ORIGINAL = """
(
    (
        !has(class_a.property_a) ? 
            false : ("Linux" == class_a.property_a)
    ) && 
    (
        (
            !has(class_b.property_b) ? 
                false : class_b.property_b.contains("os:/o:centos:centos:")
        ) || 
        (
            !has(class_b.property_b) ?
                false : class_b.property_b.contains("x-os:/o:centos:centos:")
        ) || 
        (
            !has(class_b.property_b) ?
                false : class_b.property_b.contains("os:/a:centos:centos:")
        ) || 
        (
            !has(class_b.property_b) ?
                false : class_b.property_b.contains("x-os:/a:centos:centos:")
        ) || 
        (
            !has(class_b.property_b) ? 
                false : class_b.property_b.contains("p-os:/a:centos:centos:")
        )
    )
) ? 
optional.of("Linux Team") : 
(
    (
        (
            !has(class_a.property_a) ?
                false : ("Linux" == class_a.property_a)
        ) && 
        (
            (
                !has(class_b.property_b) ?
                    false : class_b.property_b.contains("os:/o:debian:debian_linux:")
            ) || 
            (
                !has(class_b.property_b) ?
                    false : class_b.property_b.contains("x-os:/o:debian:debian_linux:")
            ) || 
            (
                !has(class_b.property_b) ?
                    false : class_b.property_b.contains("os:/a:debian:debian:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("x-os:/a:debian:debian:")
            ) || 
            (
                !has(class_b.property_b) ?
                    false : class_b.property_b.contains("p-os:/a:debian:debian_linux:")
            )
        )
    ) ?
optional.of("Linux Team") : 
(
    (
        (
            !has(class_a.property_a) ?
                false : ("Linux" == class_a.property_a)
        ) && 
        (
            (
                !has(class_b.property_b) ?
                    false : class_b.property_b.contains("os:/o:fedoraproject:fedora:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("x-os:/o:fedoraproject:fedora:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("os:/a:fedoraproject:fedora:")
            ) || 
            (
                !has(class_b.property_b) ?
                    false : class_b.property_b.contains("x-os:/a:fedoraproject:fedora:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("p-os:/a:fedoraproject:fedora:")
            )
        )
    ) ? 
optional.of("Linux Team") : 
(
    (
        (
            !has(class_a.property_a) ? 
                false : ("Linux" == class_a.property_a)
        ) && 
        (
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("os:/o:oracle:linux:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("x-os:/o:oracle:linux:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("os:/a:oracle:linux:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("x-os:/a:oracle:linux:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("p-os:/a:oracle:linux:")
            )
        )
    ) ? 
optional.of("Linux Team") : 
(
    (
        (
            !has(class_a.property_a) ? 
                false : ("Linux" == class_a.property_a)
        ) && 
        (
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("os:/o:redhat:enterprise_linux:")
        ) || 
        (
            !has(class_b.property_b) ? 
                false : class_b.property_b.contains("x-os:/o:redhat:enterprise_linux:")
        ) || 
        (
            !has(class_b.property_b) ? 
                false : class_b.property_b.contains("os:/a:redhat:enterprise_linux:")
        ) || 
        (
            !has(class_b.property_b) ? 
                false : class_b.property_b.contains("x-os:/a:redhat:enterprise_linux:")
        ) || 
        (
            !has(class_b.property_b) ? 
                false : class_b.property_b.contains("p-os:/a:redhat:enterprise_linux:")
        )
    )
) ? 
optional.of("Linux Team") : 
(
    (
        (
            !has(class_a.property_a) ? 
                false : ("Linux" == class_a.property_a)
        ) && (
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("os:/o:novell:suse_linux:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("x-os:/o:novell:suse_linux:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("os:/a:novell:suse_linux:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("x-os:/a:novell:suse_linux:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("p-os:/a:novell:suse_linux:")
            )
        )
    ) ? 
optional.of("Linux Team") : 
(
    (
        (
            !has(class_a.property_a) ? 
                false : ("Linux" == class_a.property_a)
        ) && 
        (
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("os:/o:canonical:ubuntu_linux:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("x-os:/o:canonical:ubuntu_linux:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("os:/a:canonical:ubuntu_linux:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("x-os:/a:canonical:ubuntu_linux:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("p-os:/a:canonical:ubuntu_linux:")
            )
        )
    ) ? 
optional.of("Linux Team") : 
(
    (
        (
            !has(class_a.property_a) ? 
                false : ("Linux" == class_a.property_a)
        ) && 
        (
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("os:/h:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("x-os:/h:")
            )
        )
    ) ? 
optional.of("Linux Team") : 
(
    (
        (
            !has(class_a.property_a) ? 
                false : ("Windows Server" == class_a.property_a)
        ) && 
        (
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("os:/o:microsoft:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("x-os:/o:microsoft:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("os:/a:microsoft:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("x-os:/a:microsoft:")
            )
        )
    ) ? 
optional.of("Windows Team") : 
(
    (
        (
            !has(class_a.property_a) ? 
                false : ("Windows Server" == class_a.property_a)
        ) && 
        (
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("os:/h:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("x-os:/h:")
            )
        )
    ) ? 
optional.of("Windows Team") : 
(
    (
        (
            !has(class_a.property_a) ? 
                false : ("Linux" == class_a.property_a)
        ) || 
        (
            !has(class_a.property_a) ?   
                false : ("Windows Server" == class_a.property_a)
        )
    ) ? 
optional.of("Unassigned") : 
(
    (
        (
            !has(class_a.property_a) ? 
                false : ("Windows Workstation" == class_a.property_a)
        ) && 
        (
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("os:/o:microsoft:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("x-os:/o:microsoft:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("os:/a:microsoft:")
            ) || 
            (
                !has(class_b.property_b) ? 
                    false : class_b.property_b.contains("x-os:/a:microsoft:")
            )
        )
    ) ? 
optional.of("Solutions Team") : 
(
    (
        !has(class_a.property_a) ? 
            false : ("Windows Workstation" == class_a.property_a)
    ) ? 
optional.of("End User") : optional.of("Unknown")))))))))))))
"""

CEL_EXPRESSION_SHORT = """
    "bla bla"
"""
CEL_EXPRESSION_MEDIUM = """
    (!has(class_b.integration_info.type) ?
                false:("some value" == class_b.integration_info.type)) ?
                optional.of("some value") : optional.of("some other value")
"""
CEL_EXPRESSION_ORIGINAL_NO_OPTIONAL = """
((!has(class_a.property_a) ? false : ("Linux" == class_a.property_a)) && ((!has(class_b.property_b) ? false : class_b.property_b.contains("os:/o:centos:centos:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/o:centos:centos:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("os:/a:centos:centos:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/a:centos:centos:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("p-os:/a:centos:centos:")))) ? "Linux Team" : (((!has(class_a.property_a) ? false : ("Linux" == class_a.property_a)) && ((!has(class_b.property_b) ? false : class_b.property_b.contains("os:/o:debian:debian_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/o:debian:debian_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("os:/a:debian:debian:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/a:debian:debian:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("p-os:/a:debian:debian_linux:")))) ? "Linux Team" : (((!has(class_a.property_a) ? false : ("Linux" == class_a.property_a)) && ((!has(class_b.property_b) ? false : class_b.property_b.contains("os:/o:fedoraproject:fedora:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/o:fedoraproject:fedora:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("os:/a:fedoraproject:fedora:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/a:fedoraproject:fedora:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("p-os:/a:fedoraproject:fedora:")))) ? "Linux Team" : (((!has(class_a.property_a) ? false : ("Linux" == class_a.property_a)) && ((!has(class_b.property_b) ? false : class_b.property_b.contains("os:/o:oracle:linux:")) || (!has(class_b.property_b) ? 
false : class_b.property_b.contains("x-os:/o:oracle:linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("os:/a:oracle:linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/a:oracle:linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("p-os:/a:oracle:linux:")))) ? "Linux Team" : (((!has(class_a.property_a) ? false : ("Linux" == class_a.property_a)) && ((!has(class_b.property_b) ? false : class_b.property_b.contains("os:/o:redhat:enterprise_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/o:redhat:enterprise_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("os:/a:redhat:enterprise_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/a:redhat:enterprise_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("p-os:/a:redhat:enterprise_linux:")))) ? "Linux Team" : (((!has(class_a.property_a) ? false : ("Linux" == class_a.property_a)) && ((!has(class_b.property_b) ? false : class_b.property_b.contains("os:/o:novell:suse_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/o:novell:suse_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("os:/a:novell:suse_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/a:novell:suse_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("p-os:/a:novell:suse_linux:")))) ? "Linux Team" : (((!has(class_a.property_a) ? false : ("Linux" == class_a.property_a)) && ((!has(class_b.property_b) ? false : class_b.property_b.contains("os:/o:canonical:ubuntu_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/o:canonical:ubuntu_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("os:/a:canonical:ubuntu_linux:")) || (!has(class_b.property_b) ? 
false : class_b.property_b.contains("x-os:/a:canonical:ubuntu_linux:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("p-os:/a:canonical:ubuntu_linux:")))) ? "Linux Team" : (((!has(class_a.property_a) ? false : ("Linux" == class_a.property_a)) && ((!has(class_b.property_b) ? false : class_b.property_b.contains("os:/h:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/h:")))) ? "Linux Team" : (((!has(class_a.property_a) ? false : ("Windows Server" == class_a.property_a)) && ((!has(class_b.property_b) ? false : class_b.property_b.contains("os:/o:microsoft:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/o:microsoft:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("os:/a:microsoft:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/a:microsoft:")))) ? "Windows Team" : (((!has(class_a.property_a) ? false : ("Windows Server" == class_a.property_a)) && ((!has(class_b.property_b) ? false : class_b.property_b.contains("os:/h:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/h:")))) ? "Windows Team" : (((!has(class_a.property_a) ? false : ("Linux" == class_a.property_a)) || (!has(class_a.property_a) ? false : ("Windows Server" == class_a.property_a))) ? "Unassigned" : (((!has(class_a.property_a) ? false : ("Windows Workstation" == class_a.property_a)) && ((!has(class_b.property_b) ? false : class_b.property_b.contains("os:/o:microsoft:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/o:microsoft:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("os:/a:microsoft:")) || (!has(class_b.property_b) ? false : class_b.property_b.contains("x-os:/a:microsoft:")))) ? "Solutions Team" : ((!has(class_a.property_a) ? false : ("Windows Workstation" == class_a.property_a)) ? "End User" : "unknown"))))))))))))
"""

functions = {
    "of": lambda optional, value: value,
    "none": lambda optional, : None
}

env = celpy.Environment()

before_compile = time.time()
ast = env.compile(CEL_EXPRESSION_ORIGINAL_NO_OPTIONAL)
print(f'after compile: {time.time() - before_compile}')

before_runner = time.time()
program = env.program(ast,functions=functions)
print(f'after runner: {time.time() - before_runner}')

before_eval = time.time()

def process():
    for i in range(0, 100):
        before_json_to_cel = time.time()
        cel_context = {
                        "class_a": celpy.json_to_cel({"property_a":"something"}),
                        "class_b": celpy.json_to_cel({"title":"something else","property_b":"some var","integration_info":{"type":"GitHub"}}),
                        "optional": celpy.json_to_cel({})
                    }
        after_row_pre_process = time.time() - before_json_to_cel

        before_row_eval = time.time()
        result = program.evaluate(cel_context)
        after_row_eval = time.time()
        print(f"""after eval row {i}, 
              row time: {(after_row_eval - before_json_to_cel):.3f}, 
              pre-process time: {after_row_pre_process:.3f},
              eval time: {(after_row_eval - before_row_eval):.3f}, 
              current total time: {((after_row_eval - before_eval)/60):.3f} minutes, 
              result: {result}""")

# pr = cProfile.Profile()
# pr.enable()
# print("started profiler")
process()
# print("disabling profiler")
# pr.disable()

# s = io.StringIO()
# ps = pstats.Stats(pr, stream=s).sort_stats(pstats.SortKey.TIME)
# ps.print_stats()
# print(s.getvalue())

cromwellian commented 1 week ago

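It's because of excess stringification caused by the @trace decorator and the logging calls everywhere, which happens even if logging is disabled. It needs to be patched, but for now, try:

    from celpy.celtypes import MapType
    from celpy.evaluation import Activation, NameContainer, Referent
    from lark import Tree

    # Workaround: make repr() cheap for the objects that get stringified.
    NameContainer.__repr__ = lambda self: ""
    Referent.__repr__ = lambda self: ""
    Activation.__repr__ = lambda self: ""
    MapType.__repr__ = lambda self: ""
    Tree.__repr__ = lambda self: ""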
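
To illustrate the general effect, here is a small stdlib-only sketch. It is not cel-python code; `Expensive`, `eager_log`, and `lazy_log` are hypothetical names. Building a message with an f-string calls `__repr__` before `logging` decides whether to emit the record, while passing the object as a `%r` argument defers formatting until the record is actually handled.

```python
import logging

logger = logging.getLogger("repr_demo")
logger.addHandler(logging.NullHandler())
logger.setLevel(logging.WARNING)   # DEBUG records are dropped
logger.propagate = False


class Expensive:
    """Counts how often its repr is computed."""
    calls = 0

    def __repr__(self):
        Expensive.calls += 1
        return "Expensive(...)"


def eager_log(obj):
    # The f-string is built before logger.debug() is even called.
    logger.debug(f"value: {obj!r}")


def lazy_log(obj):
    # Formatting is deferred; it never happens while DEBUG is disabled.
    logger.debug("value: %r", obj)


obj = Expensive()
eager_log(obj)
eager_calls = Expensive.calls                # repr ran although nothing was logged
lazy_log(obj)
lazy_calls = Expensive.calls - eager_calls   # repr was skipped
print(eager_calls, lazy_calls)               # prints: 1 0
```

Overriding `__repr__` as in the workaround above does not remove the eager calls; it only makes each one cheap.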

NoamSherr commented 1 week ago

@cromwellian thanks for the quick response! This indeed helped: I'm now getting 10k iterations in around 1.5 minutes (roughly 10 ms per row), which is a great improvement. Is there anything else I can do to speed it up another 10x :) ? This is still slower than cel-java; ideally I need it to run at 1 ms per iteration. Do you think that's possible, or are we blocked by Python's limitations?

cromwellian commented 1 week ago

You could implement a compiler. It may or may not speed things up, depending on how you do it. Here is a very simple approach that might work: use lambdas.

Consider:

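tree = ('*', ('+', 2, 3), ('-', 5, 2))
ops = {
  '+': lambda a, b: a + b,
  '-': lambda a, b: a - b,
  '*': lambda a, b: a * b
}

def compile(node):
    # Leaves are literal values; wrap them in a zero-argument closure.
    if not isinstance(node, tuple):
        return lambda: node
    op, left, right = node
    # Compile the children once, up front, so that calling the result
    # does not walk (or recompile) the tree again.
    fn, left_fn, right_fn = ops[op], compile(left), compile(right)
    return lambda: fn(left_fn(), right_fn())

compiled = compile(tree)
print(compiled())  # (2 + 3) * (5 - 2) = 15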

You could implement a lark.Transformer to do this and turn the entire CEL AST into chained lambdas. It may or may not give a performance boost over generating Python source strings and running them through eval. I'd work on it myself, but I haven't had time recently.
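
A CEL program also needs its variables at evaluation time, so the closures would likely take the activation as an argument. Below is a minimal sketch of that idea under loud assumptions: the tuple node shapes are hypothetical and are NOT cel-python's lark tree, plain dicts stand in for celtypes values, and Python's `and`/`or` do not reproduce CEL's error-handling semantics for `&&` and `||`.

```python
# Hypothetical tuple AST (not cel-python's lark tree):
#   ("lit", value) | ("var", name) | ("field", node, name) | ("has", node, name)
#   ("eq", a, b) | ("contains", a, b) | ("and", a, b) | ("or", a, b)
#   ("cond", test, then, otherwise)

def compile_node(node):
    """Turn an AST node into a closure that takes an activation dict."""
    kind = node[0]
    if kind == "lit":
        value = node[1]
        return lambda ctx: value
    if kind == "var":
        name = node[1]
        return lambda ctx: ctx[name]
    if kind == "field":
        target, name = compile_node(node[1]), node[2]
        return lambda ctx: target(ctx)[name]
    if kind == "has":
        target, name = compile_node(node[1]), node[2]
        return lambda ctx: name in target(ctx)
    if kind == "eq":
        left, right = compile_node(node[1]), compile_node(node[2])
        return lambda ctx: left(ctx) == right(ctx)
    if kind == "contains":
        left, right = compile_node(node[1]), compile_node(node[2])
        return lambda ctx: right(ctx) in left(ctx)
    if kind == "and":
        left, right = compile_node(node[1]), compile_node(node[2])
        return lambda ctx: left(ctx) and right(ctx)
    if kind == "or":
        left, right = compile_node(node[1]), compile_node(node[2])
        return lambda ctx: left(ctx) or right(ctx)
    if kind == "cond":
        test, then, other = (compile_node(child) for child in node[1:4])
        return lambda ctx: then(ctx) if test(ctx) else other(ctx)
    raise ValueError(f"unknown node kind: {kind!r}")


# (!has(class_a.property_a) ? false : ("Linux" == class_a.property_a))
#     ? "Linux Team" : "unknown"
guard = ("cond",
         ("has", ("var", "class_a"), "property_a"),
         ("eq", ("lit", "Linux"), ("field", ("var", "class_a"), "property_a")),
         ("lit", False))
program = compile_node(("cond", guard, ("lit", "Linux Team"), ("lit", "unknown")))

print(program({"class_a": {"property_a": "Linux"}}))  # Linux Team
print(program({"class_a": {}}))                       # unknown
```

The tree is walked once in `compile_node`; each later call to `program` only runs the nested closures.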

cromwellian commented 1 week ago

BTW, this little benchmark shows a 1.5x speedup by using lambda compilation

from datetime import datetime
from random import random

ops = {
    '+': lambda x, y: x + y,
    '-': lambda x, y: x - y,
    '*': lambda x, y: x * y,
    '/': lambda x, y: x / y if y != 0 else 1
}
ops_list = list(ops.keys())

def make_ast(depth):
    if depth == 0:
        return depth + 1
    return ops_list[int(random() * len(ops_list))], make_ast(depth - 1), make_ast(depth - 1)

def eval_ast(ast):
    if type(ast) != tuple:
        return ast
    return ops[ast[0]](eval_ast(ast[1]), eval_ast(ast[2]))

def compile(ast):
    if type(ast) != tuple:
        return lambda: ast
    left, right = compile(ast[1]), compile(ast[2])
    return lambda: ops[ast[0]](left(), right())

loops = 1000000
x = make_ast(5)
start = datetime.now()
for i in range(loops):
    eval_ast(x)
print(f"Interpreted {datetime.now() - start}")
c = compile(x)
start = datetime.now()
for i in range(loops):
    c()
print(f"Compiled {datetime.now() - start}")

cromwellian commented 6 days ago

BTW, if you want to play around, AI can do some amazing things. I asked Sonnet 3.5 to implement the compiler suggested above, and it did it on the first try (though not using a lark AST). I bet you could get it to read the cel-python AST source and implement the compiler with enough coaxing. See here:

https://x.com/cromwellian/status/1804455807968518425