microsoft / pylance-release

Documentation and issues for Pylance
Creative Commons Attribution 4.0 International

pymc.math: unknown attribute #6406

Closed ibobak closed 4 weeks ago

ibobak commented 4 weeks ago

Type: Bug

I am using the pymc library, which is well known and widely respected. Here is the problem:

image

Environment file (use conda to recreate this environment): e.yml.txt

Code to reproduce:

import numpy as np
import pandas as pd  # needed for the pd.DataFrame annotations below
import pymc as pm

def estimate_mu_sigma(data):
    log_data = np.log(data)  # Take the natural logarithm of the data
    mu_estimate = np.mean(log_data)  # estimate mu (mean of log-transformed data)
    sigma_estimate = np.std(log_data, ddof=0)  # Estimate sigma (standard deviation of log-transformed data), ddof=0 for population std
    return mu_estimate, sigma_estimate

def get_values(a_pdf, a_column, a_quantile: float = 0.0) -> np.ndarray:
    v = a_pdf[a_column].values
    if a_quantile > 0:
        v_len = len(v)
        q_value = np.quantile(v, a_quantile)
        print(f"v0 max={np.max(v)}, quantile {a_quantile}={q_value}")
        v = a_pdf[a_pdf[a_column]<q_value][a_column].values
        print(f"Previous length={v_len}, current length={len(v)}")
    return v

def train_pymc(a_pdf: pd.DataFrame, a_ad_type: str, a_cohort_day0: int, a_cohort_day1: int):
    pdf_d0: pd.DataFrame = a_pdf[(a_pdf["v_ad_type"]==a_ad_type) & (a_pdf["cohort_day"]==a_cohort_day0)]
    pdf_d1: pd.DataFrame = a_pdf[(a_pdf["v_ad_type"]==a_ad_type) & (a_pdf["cohort_day"]==a_cohort_day1)]

    v0 = get_values(pdf_d0, "v_cpm", 0.97)
    est_mu_0, est_sigma_0 = estimate_mu_sigma(v0)
    print(f"Estimated est_mu_0={est_mu_0}, est_sigma_0={est_sigma_0}")

    v1 = get_values(pdf_d1, "v_cpm", 0.97)
    est_mu_1, est_sigma_1 = estimate_mu_sigma(v1)
    print(f"Estimated est_mu_1={est_mu_1}, est_sigma_1={est_sigma_1}")

    with pm.Model():
        mu_0 = pm.Uniform("mu_0", 0.25*est_mu_0, 2*est_mu_0)                 # TODO: ranges are too broad; try narrowing them with confidence intervals
        sigma_0 = pm.Uniform("sigma_0", 0.25*est_sigma_0, 2*est_sigma_0)
        pm.LogNormal('x0', mu=mu_0, sigma=sigma_0, observed=v0)

        mu_1 = pm.Uniform("mu_1", 0.25*est_mu_1, 2*est_mu_1)                 # TODO: ranges are too broad; try narrowing them with confidence intervals
        sigma_1 = pm.Uniform("sigma_1", 0.25*est_sigma_1, 2*est_sigma_1)
        pm.LogNormal('x1', mu=mu_1, sigma=sigma_1, observed=v1)

        pm.Deterministic("delta_mean", pm.math.exp(mu_1 + sigma_1**2 / 2) - pm.math.exp(mu_0 + sigma_0**2 / 2))

        # To be explained in chapter 3.
        step = pm.NUTS()
        trace = pm.sample(20000, tune=1000, step=step, chains=4)

    return trace

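# pdf is not defined in this snippet; it must be a DataFrame with the v_ad_type, cohort_day and v_cpm columns used above
trace = train_pymc(pdf, "REWARDED", 0, 1)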

Extension version: 2024.9.1
VS Code version: Code 1.93.1 (38c31bc77e0dd6ae88a4e9cc93428cc27a56ba40, 2024-09-11T17:20:05.685Z)
OS version: Linux x64 6.5.0-45-generic
Modes:

System Info

|Item|Value|
|---|---|
|CPUs|Intel(R) Xeon(R) CPU E5-2696 v4 @ 2.20GHz (88 x 1359)|
|GPU Status|2d_canvas: enabled<br>canvas_oop_rasterization: enabled_on<br>direct_rendering_display_compositor: disabled_off_ok<br>gpu_compositing: enabled<br>multiple_raster_threads: enabled_on<br>opengl: enabled_on<br>rasterization: enabled<br>raw_draw: disabled_off_ok<br>skia_graphite: disabled_off<br>video_decode: enabled<br>video_encode: disabled_software<br>vulkan: disabled_off<br>webgl: enabled<br>webgl2: enabled<br>webgpu: disabled_off<br>webnn: disabled_off|
|Load (avg)|1, 2, 2|
|Memory (System)|251.76GB (222.50GB free)|
|Process Argv|--crash-reporter-id 27d42247-63fb-4d9e-9cb0-87d9974843dc|
|Screen Reader|no|
|VM|0%|
|DESKTOP_SESSION|ubuntu-xorg|
|XDG_CURRENT_DESKTOP|Unity|
|XDG_SESSION_DESKTOP|ubuntu-xorg|
|XDG_SESSION_TYPE|x11|

A/B Experiments

```
vsliv368:30146709
vspor879:30202332
vspor708:30202333
vspor363:30204092
vscod805:30301674
binariesv615:30325510
vsaa593:30376534
py29gd2263:31024239
c4g48928:30535728
azure-dev_surveyone:30548225
2i9eh265:30646982
962ge761:30959799
pythongtdpath:30769146
welcomedialog:30910333
pythonnoceb:30805159
asynctok:30898717
pythonmypyd1:30879173
h48ei257:31000450
pythontbext0:30879054
accentitlementst:30995554
dsvsc016:30899300
dsvsc017:30899301
dsvsc018:30899302
cppperfnew:31000557
dsvsc020:30976470
pythonait:31006305
dsvsc021:30996838
9c06g630:31013171
a69g1124:31058053
dvdeprecation:31068756
dwnewjupyter:31046869
2f103344:31071589
impr_priority:31102340
nativerepl2:31139839
refactort:31108082
pythonrstrctxt:31112756
flightc:31134773
wkspc-onlycs-t:31132770
wkspc-ranged-t:31125599
fje88620:31121564
```

rchiodo commented 4 weeks ago

Thanks for the issue. This is occurring because you're not importing pymc.math explicitly. You have to import that submodule for Pylance to resolve the attribute. The fact that the code runs is just a side effect of what the pymc package does at import time (it imports pymc.math itself).

See this issue for more information (similar situation) https://github.com/microsoft/pylance-release/issues/4326

And this documentation: https://microsoft.github.io/pyright/#/import-statements
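
To illustrate the point above, here is a minimal sketch that uses two throwaway packages instead of pymc. The names `demo_plain` and `demo_eager` are hypothetical and exist only for this example. One package has an empty `__init__.py`; the other imports its `math` submodule itself, which is the behaviour described for pymc. The submodule becomes an attribute of the package only once something has imported it:

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_package(root: Path, name: str, init_source: str) -> None:
    """Create a tiny package on disk with a `math` submodule (hypothetical names)."""
    pkg = root / name
    pkg.mkdir()
    (pkg / "__init__.py").write_text(init_source)
    (pkg / "math.py").write_text("def exp(a):\n    return 2.718281828459045 ** a\n")


def submodule_visible(name: str) -> bool:
    """Import the package and report whether `<package>.math` is reachable."""
    module = importlib.import_module(name)
    return hasattr(module, "math")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    make_package(root, "demo_plain", "")                     # __init__ imports nothing
    make_package(root, "demo_eager", "from . import math\n")  # __init__ imports the submodule
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # `import demo_plain` alone does not load the submodule.
        plain_before = submodule_visible("demo_plain")
        # `import demo_eager` works only because its __init__ imported math for us.
        eager_before = submodule_visible("demo_eager")
        # An explicit submodule import makes the attribute available either way.
        importlib.import_module("demo_plain.math")
        plain_after = submodule_visible("demo_plain")
    finally:
        sys.path.remove(tmp)

print(plain_before, eager_before, plain_after)
```

The `demo_eager` case is the one a static checker is reluctant to rely on: the attribute exists at runtime only as a side effect of the package's own imports, so writing the `import package.submodule` line explicitly is the safer habit.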

ibobak commented 3 weeks ago

This is not all.

import pymc
import pymc.math

I did this, and here is the code:

with pymc.Model() as model:
    mu_0 = pymc.Uniform("mu_0", d0["mu_ci_lower_0"], d0["mu_ci_upper_0"],
                        initval=d0["mu_estimate_0"])
    sigma_0 = pymc.Uniform("sigma_0", d0["sigma_ci_lower_0"], d0["sigma_ci_upper_0"],
                           initval=d0["sigma_estimate_0"])
    pymc.LogNormal('x0', mu=mu_0, sigma=sigma_0, observed=v0)

    mu_1 = pymc.Uniform("mu_1", d1["mu_ci_lower_1"], d1["mu_ci_upper_1"],
                        initval=d1["mu_estimate_1"])
    sigma_1 = pymc.Uniform("sigma_1", d1["sigma_ci_lower_1"], d1["sigma_ci_upper_1"],
                           initval=d1["sigma_estimate_1"])
    pymc.LogNormal('x1', mu=mu_1, sigma=sigma_1, observed=v1)

    pymc.Deterministic("delta_mean", pymc.math.exp(mu_0 + sigma_0**2 / 2) - pymc.math.exp(mu_1 + sigma_1**2 / 2))
    pymc.Deterministic("delta_mean_percent",
                       100 * (pymc.math.exp(mu_0 + sigma_0**2 / 2)
                              - pymc.math.exp(mu_1 + sigma_1**2 / 2)) / pymc.math.exp(mu_1 + sigma_1**2 / 2))

And this is what I am getting: image

ibobak commented 3 weeks ago

@rchiodo could you please re-open and look at this error?

rchiodo commented 3 weeks ago

That would likely be because pymc.math.exp is not returning anything.

Yeah, the definition looks like this:

@scalar_elemwise
def exp(a):
    """e^`a`""" 

There's actually no code there.

I'm guessing pymc is a Python wrapper around some C code? It would need to declare return types for this to work correctly.

Or you can turn off type checking in your code. That error won't show up if 'typeCheckingMode' is off, or if you put # type: ignore on the line with the error.
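
As a rough illustration of the situation described above, the sketch below uses a hypothetical `elemwise` decorator as a stand-in for the real one (it is not pymc's or pytensor's implementation). The decorated function has only a docstring as its body, so a checker that goes by the body may infer a `None` return even though the runtime value is a float. It shows the `# type: ignore` route mentioned above, plus `typing.cast` as one further option that is an assumption of this example rather than something suggested in the thread:

```python
import math
from typing import cast


def elemwise(fn):
    # Hypothetical stand-in for a decorator: it ignores the empty body and
    # returns a real implementation looked up by the function's name.
    return getattr(math, fn.__name__)


@elemwise
def exp(a):
    """e^`a`"""


# At runtime exp is math.exp, so the subtraction works; the comment silences
# a checker that inferred None from the docstring-only body.
delta = exp(1.0) - exp(0.0)  # type: ignore

# Alternative (assumption of this sketch): state the expected type at the call site.
delta_cast = cast(float, exp(1.0)) - cast(float, exp(0.0))

print(delta, delta_cast)
```

Both forms only affect static analysis; neither changes what the code does when it runs.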