serge-sans-paille / pythran

Ahead of Time compiler for numeric kernels
https://pythran.readthedocs.io
BSD 3-Clause "New" or "Revised" License
1.98k stars 191 forks source link

Regression and crash #2209

Open paugier opened 1 month ago

paugier commented 1 month ago

I think this code was compiling fine with an old Pythran version (maybe few years ago). Now it crashes.


def get_counter_value(counter, key):
    try:
        return counter[key]
    except KeyError:
        return 0

def is_valid_distance_matches_1ray(
    candidate, closest_point, approved_matches_ray, min_distance2_matches_1ray
):
    x0, y0, z0 = closest_point
    for _, ray_id in candidate:
        if ray_id in approved_matches_ray:
            other_closest_points = approved_matches_ray[ray_id]
            for x1, y1, z1 in other_closest_points:
                # fmt: off
                distance2 = (x1 - x0) ** 2 + (y1 - y0) ** 2 + (z1 - z0) ** 2
                # fmt: on
                if distance2 < min_distance2_matches_1ray:
                    return False
    return True

def increment_counter(counter, key):
    try:
        counter[key] += 1
    except KeyError:
        counter[key] = 1

# pythran export kernel_make_approved_matches__min_distance_matches_1ray((int32, int32) list list, float64[:, :], float64[:], int64[:], float64, int, float64)

def kernel_make_approved_matches__min_distance_matches_1ray(
    candidates,
    closest_points,
    distances,
    indices_better_candidates,
    max_distance,
    max_matches_per_ray,
    min_distance_matches_1ray,
):
    approved_matches = []
    approved_closest_points = []
    approved_distances = []
    # match_counter = dict()
    id_pair = (0, 0)
    match_counter = {id_pair: 0}
    approved_matches_ray = {}
    removed_because_sphere = 0
    min_distance2_matches_1ray = min_distance_matches_1ray**2
    for index_candidate in indices_better_candidates:
        distance = distances[index_candidate]
        if distance >= max_distance:
            continue
        candidate = candidates[index_candidate]
        valid = True
        for id_pair in candidate:
            nb_matches = get_counter_value(match_counter, id_pair)
            if nb_matches >= max_matches_per_ray:
                valid = False
                break
        closest_point = closest_points[index_candidate]
        if not is_valid_distance_matches_1ray(
            candidate,
            closest_point,
            approved_matches_ray,
            min_distance2_matches_1ray,
        ):
            valid = False
            removed_because_sphere += 1
        if valid:
            for id_pair in candidate:
                increment_counter(match_counter, id_pair)
                ray_id = id_pair[1]
                approved_matches_ray.setdefault(ray_id, [])
                other_closest_points = approved_matches_ray[ray_id]
                other_closest_points.append(closest_point)
            approved_matches.append(candidate)
            approved_closest_points.append(closest_point)
            approved_distances.append(distance)
    return (
        approved_matches,
        approved_closest_points,
        approved_distances,
        removed_because_sphere,
    )

The C++ error contains

error: type '(anonymous namespace)::pythonic::types::empty_dict' does not provide a subscript operator

Is it useful to produce a simpler code to reproduce?

paugier commented 1 month ago

I was able to recompile this code with Python 3.9 and

pypi-timemachine 2022-01-01
pip install  --index-url http://localhost:54455 pythran numpy

This installs beniget-0.4.1 gast-0.5.3 numpy-1.22.0 ply-3.11 pythran-0.11.0 and with these versions, it works.

The crash was already obtained in 2023-01-01 (beniget-0.4.1 gast-0.5.3 numpy-1.24.1 ply-3.11 pythran-0.12.0).

serge-sans-paille commented 1 month ago

I can also reproduce. the problem is non trivial, I won't delay the release for it :-/

paugier commented 2 weeks ago

Hi @serge-sans-paille,

We have a project for which we use again this old code. For now, we use it with an old environment (2022-01-01) and it works, but it would be much nicer/simpler for us if this code was compatible with standard 2024 environments.

So my questions: do you think this is fixable in Pythran master and next Python release will support that? Or it is now too difficult and I should "fix" our code to overcome this Pythran issue? It is just to know if I should wait with this situation or find another solution to be able to use standard environments.

serge-sans-paille commented 2 weeks ago

I'll give it another try this week.