ARM-software / armnn

Arm NN ML Software. The code here is a read-only mirror of https://review.mlplatform.org/admin/repos/ml/armnn
https://developer.arm.com/products/processors/machine-learning/arm-nn
MIT License

Segmentation fault (core dumped) while getting the node attributes from the TF Graphdef #582

Closed Darshvino closed 2 years ago

Darshvino commented 2 years ago

Hi armnn team,

Thanks for your amazing work.

I was trying to use the TF Parser code (https://github.com/ARM-software/armnn/blob/branches/armnn_20_08/src/armnnTfParser/TfParser.cpp) for deserializing the Frozen graph.

    FILE* fd = fopen("model.pb", "rb");
    if (fd == nullptr)
    {
        std::cout << "*******there is no graph file found**********" << std::endl;
    }
    GraphDef graphDef;

    std::cout << "right after ReadBinaryProto" << std::endl;
    google::protobuf::io::FileInputStream  inStream(fileno(fd));
    google::protobuf::io::CodedInputStream codedStream(&inStream);
    codedStream.SetTotalBytesLimit(INT_MAX, INT_MAX);
    bool success = graphDef.ParseFromCodedStream(&codedStream);
    fclose(fd);

The above code works correctly, but when I try to get the node attributes from the GraphDef, I get a segmentation fault (core dumped).

    for (int i = 0; i < graphDef.node_size(); ++i)
    {
        const NodeDef& node = graphDef.node(i);
        DataType type = tensorflow::DT_FLOAT;
        //auto node = graphDef.node(i);
        auto attr = node.attr().at("T");
        type = attr.type();
    }

I am not sure whether this is a model issue or an issue in the deserialization code. It would be a great help if anyone could help me understand and resolve it.
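One thing that can be ruled out cheaply in the loading snippet: when fopen fails, the code prints a message but then carries on and calls fileno(fd) on a null pointer. Below is a minimal sketch of a loader that stops on failure instead. ReadFileBytes is a hypothetical helper name, not part of Arm NN or protobuf, and the sketch only reads the bytes; handing them to the protobuf parser is left out on purpose.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: reads a whole file into 'out'.
// Returns false if the file cannot be opened or a read error occurs,
// so the caller never continues with an invalid FILE*.
bool ReadFileBytes(const char* path, std::string& out)
{
    FILE* fd = std::fopen(path, "rb");
    if (fd == nullptr)
    {
        return false; // stop here instead of falling through
    }

    out.clear();
    char buffer[4096];
    std::size_t count = 0;
    while ((count = std::fread(buffer, 1, sizeof(buffer), fd)) > 0)
    {
        out.append(buffer, count);
    }

    const bool readFailed = std::ferror(fd) != 0;
    std::fclose(fd);
    return !readFailed;
}
```

A caller would check the return value before doing any parsing, for example `if (!ReadFileBytes("model.pb", bytes)) { /* report and return */ }`.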

Thanks and Regards

Colm-in-Arm commented 2 years ago

Hi @Darshvino,

Support for TensorFlow was removed in 21.05 but I see you are back in 20.08.

"graphDef" has at least been partially populated in your code as graphDef.node_size() > 0. Is the segmentation fault happening when you access "node" or "attr"?

Colm.

Darshvino commented 2 years ago

Hi @Colm-in-Arm,

Thanks for your reply. The segmentation fault is happening while I access "attr".

I am not sure about the exact issue. I had also tried with TF 21.05.

Looking forward to your reply.

Thanks, Darshan C G

Colm-in-Arm commented 2 years ago

Hi Darshan C G,

If I look at the TF documentation, this code, "auto attr = node.attr().at("T");", should result in either attr being set or a std::out_of_range exception. However, I wonder if the deserialization is getting in the way a bit. Have you tried examining or iterating over the node.attr() map to see if there are any attributes there at all?
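As a rough sketch of what I mean by examining the map: list the keys first, then look a key up with find() so that a missing key is an ordinary, checkable case. The sketch uses std::map as a stand-in so it builds without the TensorFlow protobuf headers; AttrMap, AttrNames and TryGetAttr are hypothetical names. Bear in mind that not every node has a "T" attribute (Const and Placeholder nodes, for example, carry "dtype" instead).

```cpp
#include <map>
#include <string>
#include <vector>

// Stand-in for the map returned by node.attr(). The int value is a
// placeholder for the real attribute value type (an assumption made
// only to keep the sketch self-contained).
using AttrMap = std::map<std::string, int>;

// Collects every attribute name present on a node.
std::vector<std::string> AttrNames(const AttrMap& attrs)
{
    std::vector<std::string> names;
    for (const auto& entry : attrs)
    {
        names.push_back(entry.first);
    }
    return names;
}

// Looks up a key without assuming it exists.
// Returns true and writes 'value' only when the key is found.
bool TryGetAttr(const AttrMap& attrs, const std::string& key, int& value)
{
    const auto it = attrs.find(key);
    if (it == attrs.end())
    {
        return false;
    }
    value = it->second;
    return true;
}
```

If even listing the names crashes on the real map, the problem is unlikely to be the "T" key itself.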

Colm.

Darshvino commented 2 years ago

Hi @Colm-in-Arm,

Thanks very much for your kind reply!

I am able to get the node names correctly.

    for (int i = 0; i < graphDef.node_size(); ++i)
    {
        const NodeDef& node = graphDef.node(i);
        std::cout << "node _name==>" << node.name().c_str() << std::endl;
    }

What I am trying to do here is get the attribute values and the weights from all the nodes in the GraphDef. Whenever I call node.attr(), I get the segmentation fault.

For example:

    const tensorflow::NodeDef node = graphDef.node(2);
    auto attr_2 = node.attr();

The segmentation fault occurs at the second line.

I am attaching the backtrace below:

0x00007ffff5f372ea in google::protobuf::hash<char const*>::operator() (this=0x7fffffffcea7, str=0x5de58948f7894855 <error: Cannot access memory at address 0x5de58948f7894855>)
    at /home/darshan/project/sw/external/protobuf/src/google/protobuf/stubs/hash.h:66
66      for (; *str != '\0'; str++) {
(gdb) bt
#0  0x00007ffff5f372ea in google::protobuf::hash<char const*>::operator() (this=0x7fffffffcea7, str=0x5de58948f7894855 <error: Cannot access memory at address 0x5de58948f7894855>)
    at /home/darshan/project/sw/external/protobuf/src/google/protobuf/src/google/protobuf/stubs/hash.h:66
#1  0x00007ffff5f3735b in google::protobuf::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >::operator() (this=0x5555557c5450, 
    key=<error: Cannot access memory at address 0x5de58948f7894855>) at /home/darshan/project/sw/external/protobuf/src/google/protobuf/stubs/hash.h:83
#2  0x00007ffff5f44e49 in google::protobuf::Map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tensorflow::AttrValue>::InnerMap::BucketNumber (this=0x5555557c5450, 
    k=<error: Cannot access memory at address 0x5de58948f7894855>) at /home/darshan/project/sw/external/protobuf/src/google/protobuf/map.h:888
#3  0x00007ffff5f4321e in google::protobuf::Map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tensorflow::AttrValue>::InnerMap::FindHelper (this=0x5555557c5450, 
    k=<error: Cannot access memory at address 0x5de58948f7894855>, it=0x0) at /home/darshan/project/sw/external/protobuf/src/google/protobuf/map.h:647
#4  0x00007ffff5f40376 in google::protobuf::Map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tensorflow::AttrValue>::InnerMap::FindHelper (this=0x5555557c5450, 
    k=<error: Cannot access memory at address 0x5de58948f7894855>) at /home/darshan/project/sw/external/protobuf/src/google/protobuf/map.h:643
#5  0x00007ffff5f3c782 in google::protobuf::Map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tensorflow::AttrValue>::InnerMap::find (this=0x5555557c5450, 
    k=<error: Cannot access memory at address 0x5de58948f7894855>) at /home/darshan/project/sw/external/protobuf/src/google/protobuf/map.h:558
#6  0x00007ffff5f3c9d0 in google::protobuf::Map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tensorflow::AttrValue>::find (this=0x7fffffffd210, 
    key=<error: Cannot access memory at address 0x5de58948f7894855>) at /home/darshan/project/sw/external/protobuf/src/google/protobuf/map.h:1078
#7  0x00007ffff5f3a623 in google::protobuf::Map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tensorflow::AttrValue>::insert<google::protobuf::Map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tensorflow::AttrValue>::const_iterator> (this=0x7fffffffd210, first=..., last=...)
    at /home/darshan/project/sw/external/protobuf/src/google/protobuf/src/google/protobuf/map.h:1112
#8  0x00007ffff5f38a5a in google::protobuf::Map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, tensorflow::AttrValue>::Map (this=0x7fffffffd210, other=...)
    at /home/darshan/project/sw/external/protobuf/src/google/protobuf/src/google/protobuf/map.h:149
#9  0x00007ffff5e1c7c9 in parse ()
    at tensorflow/TFParser.cpp:824
#10 0x0000555555563632 in parseTFNetwork() ()
#11 0x0000555555563e34 in parseAndCompile() ()
#12 0x000055555555e2cb in launchTest() ()
#13 0x000055555555f44d in main ()
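Frame #8 in the backtrace above is the copy constructor of google::protobuf::Map, which fits the two-line example: `auto attr_2 = node.attr();` makes a full copy of the attribute map, because `auto` drops the reference returned by the accessor. The sketch below shows that difference with hypothetical stand-in types (CountingMap and FakeNode are not TensorFlow or protobuf classes). Binding by const reference avoids the copy, but if the map contents are already invalid, as the unreadable key address suggests, it may only move the crash elsewhere.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for an attribute map that counts its own copies.
struct CountingMap
{
    static int copies;
    std::map<std::string, int> data;

    CountingMap() = default;
    CountingMap(const CountingMap& other) : data(other.data) { ++copies; }
    CountingMap& operator=(const CountingMap&) = default;
};
int CountingMap::copies = 0;

// Hypothetical stand-in for a node whose attr() returns a const reference.
class FakeNode
{
public:
    const CountingMap& attr() const { return m_Attr; }

private:
    CountingMap m_Attr;
};

// Number of map copies made when the result is stored with plain 'auto'.
int CopiesMadeByAuto(const FakeNode& node)
{
    const int before = CountingMap::copies;
    auto attrCopy = node.attr(); // reference is dropped: copy constructor runs
    (void)attrCopy;
    return CountingMap::copies - before;
}

// Number of map copies made when the result is bound with 'const auto&'.
int CopiesMadeByConstRef(const FakeNode& node)
{
    const int before = CountingMap::copies;
    const auto& attrRef = node.attr(); // binds to the existing map: no copy
    (void)attrRef;
    return CountingMap::copies - before;
}
```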

I am not sure what exactly the issue is here: is it a deserialization issue, is the way I access the attributes wrong, or is it something else?

TF version: 2.3. Protobuf version: 3.9.2 (compatible with TF 2.3).

Can you please help me to resolve the issue? It would be a great help from your side.

Thanks and Regards.

Darshvino commented 2 years ago

Hi @Colm-in-Arm,

I am able to print the Graphdef content and all seems to be right. I am able to get the correct graph.

    for (int i = 0; i < graph_Def.node_size(); i++)
    {
        graph_Def.node(i).PrintDebugString();
    }

But I am not sure how to get the attributes and the weights of each node from the GraphDef.
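For the weights, my current understanding is that a frozen graph keeps them in Const nodes, in the "value" attribute, and that larger tensors store their data as raw bytes in the tensor_content field. The sketch below covers only the last step, turning such a byte string into floats. It assumes the dtype is DT_FLOAT and that the bytes are in host byte order; DecodeFloatContent is a hypothetical helper, and small tensors may keep their values in the repeated float_val field instead, which this does not handle.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: decodes raw tensor bytes into float32 values.
// Assumption: 'content' holds DT_FLOAT data in host byte order, such as
// the tensor_content bytes of a Const node's "value" attribute.
// Returns false if the byte count is not a multiple of sizeof(float).
bool DecodeFloatContent(const std::string& content, std::vector<float>& out)
{
    if (content.size() % sizeof(float) != 0)
    {
        return false;
    }

    out.resize(content.size() / sizeof(float));
    if (!out.empty())
    {
        // memcpy avoids alignment problems with the string's buffer.
        std::memcpy(out.data(), content.data(), content.size());
    }
    return true;
}
```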

james-conroy-arm commented 2 years ago

Hi @Darshvino ,

This StackOverflow question may help you with getting the weights from the GraphDef; you may also need to refer to TensorFlow's documentation. If you still have issues, you should get more helpful information by opening an issue with TensorFlow.

Since this appears to be an environment issue and affects the TF Parser, which was removed from Arm NN in release 21.05, I think this ticket can be closed. Please feel free to re-open it if this is not the case.

Thanks, James

james-conroy-arm commented 2 years ago

From Arm NN 21.05 release onwards, we recommend using our TF Lite parser instead of the removed TF Parser. Please refer to TensorFlow's documentation for help with converting TF models to TF Lite format: https://www.tensorflow.org/lite/convert