openvinotoolkit / openvino

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
https://docs.openvino.ai
Apache License 2.0

[Good First Issue][Python API]: Create constant for string tensor and fix segfault #23611

Open rkazants opened 3 months ago

rkazants commented 3 months ago

Context

Currently, creating a string-type constant causes a segfault:

import openvino.runtime.opset14 as ov
import numpy as np
str_const = ov.constant(np.array(['openvino'], dtype=str))

To reproduce, install the latest nightly build:

pip install openvino-nightly
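
Because the reproducer can take down the interpreter, it may be convenient to run it in a child process and inspect the exit status. The following is a minimal sketch using only the standard library; the helper name run_isolated and the REPRO string are illustrative and not part of OpenVINO.

```python
import subprocess
import sys

# The reproducer from this issue, kept as a string so that it runs in a
# separate interpreter and a crash does not kill the calling process.
REPRO = """
import numpy as np
import openvino.runtime.opset14 as ov
ov.constant(np.array(['openvino'], dtype=str))
"""


def run_isolated(code: str) -> int:
    """Run `code` in a fresh Python interpreter and return its exit status."""
    return subprocess.run([sys.executable, "-c", code]).returncode


# Usage (requires openvino and numpy to be installed in the environment):
#     status = run_isolated(REPRO)
```

On POSIX systems, a child that is killed by a signal yields a negative status equal to the negated signal number, so a segfault would typically show up as -11, while an uncaught Python exception such as MemoryError gives a status of 1.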

What needs to be done?

Implement creation of string-type constants in the Python API and fix the segfault.

Example Pull Requests

No response

Resources

Contact points

Ticket

No response

dante-hl commented 3 months ago

Was wondering if I could take this towards the GSoC prerequisite.

rkazants commented 3 months ago

Hi @dante-hl, please ask this question in Discussions, in the thread dedicated to the project idea. Which project idea are you going to apply to?

Best regards, Roman

dante-hl commented 3 months ago

Oh, sorry about that - I was thinking of applying to one of the two:

Project 10: Accelerating PyTorch Lightning and ComfyUI with torch.compile OpenVINO
Project 11: PyTorch Model Optimizations with torch.compile OpenVINO Backend

Regardless, I'll make a discussion post specifically for this right away - sorry about the inconvenience!

awayzjj commented 3 months ago

.take

github-actions[bot] commented 3 months ago

Thank you for looking into this issue! Please let us know if you have any questions or require any help.

awayzjj commented 3 months ago

@rkazants Hi, I have several questions.

1. Which branch corresponds to the openvino-nightly build? When I tested using the test script you provided on the latest master branch, I encountered a MemoryError: std::bad_alloc error. Should I make modifications directly on the master branch? I have already attempted some modifications based on the master branch.

2. The error traceback is as follows:

https://github.com/openvinotoolkit/openvino/blob/c3c409ee133ffb26bf8fd5570ef50a7c004839a4/src/bindings/python/src/pyopenvino/pyopenvino.cpp#L234
https://github.com/openvinotoolkit/openvino/blob/c3c409ee133ffb26bf8fd5570ef50a7c004839a4/src/bindings/python/src/pyopenvino/graph/ops/constant.cpp#L104-L114
https://github.com/openvinotoolkit/openvino/blob/c3c409ee133ffb26bf8fd5570ef50a7c004839a4/src/bindings/python/src/pyopenvino/core/common.hpp#L100-L105
https://github.com/openvinotoolkit/openvino/blob/c3c409ee133ffb26bf8fd5570ef50a7c004839a4/src/bindings/python/src/pyopenvino/core/common.cpp#L286-L301
https://github.com/openvinotoolkit/openvino/blob/c3c409ee133ffb26bf8fd5570ef50a7c004839a4/src/core/src/op/constant.cpp#L131-L140

The line std::uninitialized_copy_n(src_strings, num_elements, dst_strings); ultimately caused the error.

3. As shown in the code snippet below, I have added some print statements and attempted different approaches:

Constant::Constant(const element::Type& type, const Shape& shape, const void* data) : Constant(false, type, shape) {
    if (m_element_type == ov::element::string) {
        auto num_elements = shape_size(m_shape); // 1
        const std::string* src_strings = static_cast<const std::string*>(data);
        const char* src_ptr = reinterpret_cast<const char*>(src_strings);
        printf("src_string: %c|%c|%c|%c|%c\n", *src_ptr, *(src_ptr+1), *(src_ptr+2), *(src_ptr+3), *(src_ptr+4));

        std::string* dst_strings = static_cast<std::string*>(get_data_ptr_nc());
        char* dst_ptr = reinterpret_cast<char*>(dst_strings);
        printf("dst_string: %c|%c|%c|%c|%c\n", *dst_ptr, *(dst_ptr+1), *(dst_ptr+2), *(dst_ptr+3), *(dst_ptr+4));

        std::uninitialized_copy_n(src_strings, num_elements, dst_strings); // case 0: MemoryError: std::bad_alloc
        //std::uninitialized_fill_n(dst_strings, num_elements, std::string()); // case 1: no core, reached copy ctor
        //std::memcpy(get_data_ptr_nc(), data, mem_size()); // case 2: copied src_string to dst_string, reached copy ctor
        *dst_ptr = 'x'; // case 3: 
        printf("modified dst_string: %c|%c|%c|%c|%c\n", *dst_ptr, *(dst_ptr+1), *(dst_ptr+2), *(dst_ptr+3), *(dst_ptr+4));
    } else {
        std::memcpy(get_data_ptr_nc(), data, mem_size());
    }
}

Constant::Constant(const Constant& other)
    : m_element_type{other.m_element_type},
      m_shape{other.m_shape},
      m_data{other.m_data},
      m_all_elements_bitwise_identical{other.m_all_elements_bitwise_identical.load()},
      m_all_elements_bitwise_identical_checked{other.m_all_elements_bitwise_identical_checked.load()} {
    std::cout << "in copy ctor" << std::endl;
    constructor_validate_and_infer_types();
}

case 0: No modification, error occurs

src_string: o||||p
dst_string: ;||Z||
Traceback (most recent call last):
  File "/home/ubuntu/dev/openvino/build/bug.py", line 8, in <module>
    str_const = ov.constant(np.array(['openvion'], dtype=str))
  File "/home/ubuntu/.local/lib/python3.10/site-packages/openvino/runtime/utils/decorators.py", line 24, in wrapper
    node = node_factory_function(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/openvino/runtime/opset13/ops.py", line 337, in constant
    return Constant(_value, shared_memory=_shared_memory)
MemoryError: std::bad_alloc

case 1: Initialization is done, exit normally.

src_string: o||||p
dst_string: ]||||
modified dst_string: ||D||
in copy ctor

case 2: Similar to the else branch; the src_string copy succeeds, but a segfault occurs later

src_string: o||||p
dst_string: U|S|||*
modified dst_string: o||||p
in copy ctor
Segmentation fault (core dumped)

case 3: Directly modifying the first character of dst_string; a segfault occurs

src_string: o||||p
dst_string: |||
modified dst_string: x|||
in copy ctor
Segmentation fault (core dumped)

Could you please provide me with some debugging tips or any other suggestions? Thank you.
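
One possible reading of the src_string: o||||p output above is that data does not point to std::string objects at all. NumPy's fixed-width unicode dtype (what dtype=str produces, e.g. <U8) stores each element as 4-byte UTF-32 code units, so printing the first five bytes with %c would give 'o', three NUL bytes and 'p'. The sketch below emulates that layout with the standard library only; the helper decode_fixed_width is hypothetical, and whether this is what the binding actually receives is an assumption to verify (for instance with arr.tobytes() in Python).

```python
# Assumption: the buffer behind np.array(['openvino'], dtype=str) is
# fixed-width UTF-32 (4 bytes per character), emulated here with str.encode
# for a little-endian host.
text = "openvino"
raw = text.encode("utf-32-le")

# Mimic the printf("%c|%c|%c|%c|%c") debug line: NUL bytes print as nothing.
first_five = [chr(b) if b else "" for b in raw[:5]]
print("src_string: " + "|".join(first_five))


def decode_fixed_width(buf: bytes, chars_per_item: int) -> list:
    """Split a fixed-width UTF-32-LE buffer into Python strings.

    Each item occupies 4 * chars_per_item bytes; shorter strings are
    padded with NUL code points, which are stripped here.
    """
    item_size = 4 * chars_per_item
    items = []
    for offset in range(0, len(buf), item_size):
        chunk = buf[offset:offset + item_size]
        items.append(chunk.decode("utf-32-le").rstrip("\x00"))
    return items


# Two items of width 8: "openvino" and "ov" padded with NULs.
buffer = raw + "ov".ljust(8, "\x00").encode("utf-32-le")
print(decode_fixed_width(buffer, 8))
```

If this reading is right, casting the buffer to const std::string* makes uninitialized_copy_n interpret character data as string internals, which could explain the std::bad_alloc, and the binding would need to build real strings from each element before constructing the constant.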

p-wysocki commented 3 months ago

cc @rkazants

awayzjj commented 2 months ago

@rkazants Hi, could you please provide me with some debugging tips or any other suggestions? :)

amkarn258 commented 2 days ago

.take

github-actions[bot] commented 2 days ago

Thank you for looking into this issue! Please let us know if you have any questions or require any help.