shogun-toolbox / shogun

Shōgun
http://shogun-toolbox.org
BSD 3-Clause "New" or "Revised" License
3.03k stars 1.04k forks source link

CCARTree serialization/deserialization error #3481

Open tt83 opened 7 years ago

tt83 commented 7 years ago

I'm trying to serialize/deserialize a CCARTree object in C++, but when I call load_serializable I'm getting an error: In file ~/Library/shogun-develop/src/shogun/base/Parameter.cpp line 2212: TParameter::new_sgserial(): Class CBinaryTreeMachineNode' was not listed during compiling Shogun :( ... Can not construct it form_root'! Fragment of code:

        {
            auto tree = new shogun::CCARTree(ft, shogun::PT_REGRESSION);
            tree->set_labels(std::get<1>(dataset));
            tree->set_weights(std::get<2>(dataset));
            tree->train(std::get<0>(dataset));

            auto file = new shogun::CSerializableAsciiFile("test.dat", 'w');
            if (!tree->save_serializable(file))
            {
                std::cerr << "Error saving file" << std::endl;
            }
            file->close();
            SG_UNREF(file);
            SG_UNREF(tree);
        }

        auto tree = new shogun::CCARTree();
        auto file = new shogun::CSerializableAsciiFile("test.dat", 'r');
        if (!tree->load_serializable(file))
        {
            std::cerr << "Error loading file" << std::endl;
            file->close();
            SG_UNREF(file);
            SG_UNREF(tree);
            return -1;
        }
        file->close();
        SG_UNREF(file);
karlnapf commented 7 years ago

Thanks for reporting this I will investigate and get back to you

karlnapf commented 7 years ago

3537

zrlkau commented 6 years ago

Hi @karlnapf,

I ran into the same issue with 6.0.0 (shogun-6.0.0-4.fc26) and with the latest git code (d8c84e90df99757aef7ebd43ade332dad5bbc8e7) while I was trying to load a CARTree in C++ that I trained and serialized in Python. Do you have an outlook on when this might be fixed (or a workaround for now, since I'm working towards a looming paper submission deadline).

thanks, Michael

karlnapf commented 6 years ago

Ah yes this one. I’ll try to have a look later today. Pls ping again next week if nothing happened. This is definitely something that needs fixing asap

zrlkau commented 6 years ago

Thanks for looking into it @karlnapf. I dug into it a little bit myself and I found that one problem might be that in src/shogun/multiclass/tree/BinaryTreeMachineNode.h the class definition is split across multiple lines, i.e.

template <typename T>
class CBinaryTreeMachineNode
       : public CTreeMachineNode<T>

instead of

template <typename T> class CBinaryTreeMachineNode : public CTreeMachineNode<T>

If I change that, then src/shogun/base/class_list.cpp.py picks it up (but unfortunately, twice, as simple and as template class - I manually removed it as simple class.) and adds the necessary functions for it into class_list.cpp. Unfortunately, I still get the same error (which is somewhat confusing to me, because I found _ZL28__new_CBinaryTreeMachineNodeN6shogun14EPrimitiveTypeE in libshogun.so):

[...]/shogun/src/shogun/base/Parameter.cpp line 2210: TParameter::new_sgserial(): Class CBinaryTreeMachineNode was not listed during compiling Shogun :( ... Can not construct it for m_root![GCDEBUG] unref() refcount 4294967295, obj SerializableAsciiFile (0x7f24b4008f50) destroying

Update: It's because the object it tries to deserialize is a CBinaryTreeMachineNode which it can't because the corresponding 'case' statement is empty.

karlnapf commented 6 years ago

Just briefly looked into this. Unfortunately there is some deeper issues causing this. I give it a shot fixing later.

If it takes too much time: this issue will be resolved once we have the new Parameter/serialisation framework in place which is high priority it also a lot of work.

On Thu, 12 Apr 2018 at 12:22, zrlkau notifications@github.com wrote:

Thanks for looking into it @karlnapf https://github.com/karlnapf. I dug into it a little bit myself and I found that one problem might be that in src/shogun/multiclass/tree/BinaryTreeMachineNode.h the class definition is split across multiple lines, i.e.

template class CBinaryTreeMachineNode : public CTreeMachineNode

instead of

template class CBinaryTreeMachineNode : public CTreeMachineNode

If I change that, then src/shogun/base/class_list.cpp.py picks it up (but unfortunately, twice, as simple and as template class - I manually removed it as simple class.) and adds the necessary functions for it into class_list.cpp. Unfortunately, I still get the same error (which is somewhat confusing to me, because I found _ZL28__new_CBinaryTreeMachineNodeN6shogun14EPrimitiveTypeE in libshogun.so):

[...]/shogun/src/shogun/base/Parameter.cpp line 2210: TParameter::new_sgserial(): Class CBinaryTreeMachineNode was not listed during compiling Shogun :( ... Can not construct it for m_root![GCDEBUG] unref() refcount 4294967295, obj SerializableAsciiFile (0x7f24b4008f50) destroying

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/shogun-toolbox/shogun/issues/3481#issuecomment-380771297, or mute the thread https://github.com/notifications/unsubscribe-auth/AAqqv_2WXaPh61d-hni8CBfiS_VG2TYHks5tnzjwgaJpZM4KjWZO .

-- Sent from my phone

zrlkau commented 6 years ago

Hi @karlnapf ,

thanks for looking into it. I was afraid that it's something more fundamental since that case isn't covered in any templated class. If you can find a fix, it'd really appreciate it but I understand that it's not a simple problem.

thanks, Michael

karlnapf commented 6 years ago

The reason is not that this case is not covered in templated classes (or at least that is easy to fix), but the problem is that the tree implementations were never written in a way that they can be serialized (which is our bad for having not enforced that back when the GSoC student wrote this).

I opened an issue that suggests a workaround #4242

karlnapf commented 6 years ago

We might pick this up at this weekend's hackathon, or maybe delegate it to a GSoC student, as it is a cool task to learn more about shogun's internals.