DarkMark crashes - Githubissues

hassoun123 commented 9 months ago

Although the training process finished, and the weights and cfg files all look good with correct paths, DarkMark crashes whenever I try to load once the best weights and cfg files exist.

stephanecharette commented 9 months ago

This is not enough detail for me to determine what might have gone wrong. At the very least, you need to look at the log file and see where the crash happened, and the call stack that gets logged.

hassoun123 commented 9 months ago

The output from Valgrind indicates memory allocation issues within the DarkMark application, specifically in functions that are part of the OpenCV library (libopencv_core.so) and the DarkHelp::NN class of DarkMark. The "possibly lost" bytes suggest allocations that were never freed, which can indicate memory leaks. The log also shows that DarkMark is hitting a segmentation fault.

2023-12-23 19:19:57 finding all images and markup files in /home/hassan/YOLOv4/Paddle
2023-12-23 19:19:57 number of images found in /home/hassan/YOLOv4/Paddle: 3955
2023-12-23 19:19:57 loading darknet neural network
2023-12-23 19:19:57 attempting to load neural network /home/hassan/YOLOv4/Paddle/Paddle.cfg / /home/hassan/YOLOv4/Paddle/Paddle_best.weights / /home/hassan/YOLOv4/Paddle/Paddle.names
2023-12-23 19:19:57 neural network loaded in 166.778 milliseconds
2023-12-23 19:19:57 number of name entries: 1
2023-12-23 19:19:57 aborting due to signal: "Segmentation fault" [signal #11]
2023-12-23 19:19:57 backtrace #0: ./DarkMark: get_backtrace[abi:cxx11]() +0x4f [0x56505d617ccf]
2023-12-23 19:19:57 backtrace #1: ./DarkMark: dm::DarkMarkApplication::signal_handler(int) +0x1ca [0x56505d61908a]
2023-12-23 19:19:57 backtrace #2: /lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7f353aeb1520]
2023-12-23 19:19:57 backtrace #3: /lib/x86_64-linux-gnu/libc.so.6(+0x1af84e) [0x7f353b01e84e]
2023-12-23 19:19:57 backtrace #4: /lib/libdarknet.so: im2col_cpu_ext +0x65e [0x7f353c2d57fe]
2023-12-23 19:19:57 backtrace #5: /lib/libdarknet.so: forward_convolutional_layer +0x165 [0x7f353c258b15]
2023-12-23 19:19:57 backtrace #6: /lib/libdarknet.so: forward_network +0x87 [0x7f353c2fa8f7]
2023-12-23 19:19:57 backtrace #7: /lib/libdarknet.so: network_predict +0x87 [0x7f353c2fc127]
2023-12-23 19:19:57 backtrace #8: ./DarkMark: DarkHelp::NN::predict_internal_darknet() +0x23f [0x56505da53acf]
2023-12-23 19:19:57 backtrace #9: ./DarkMark: DarkHelp::NN::predict_internal(cv::Mat, float) +0x2e2 [0x56505da590a2]
2023-12-23 19:19:57 backtrace #10: ./DarkMark: DarkHelp::NN::predict(cv::Mat, float) +0xa7 [0x56505da5b4d7]
2023-12-23 19:19:57 backtrace #11: ./DarkMark: dm::DMContent::load_image(unsigned long, bool, bool) +0x76b [0x56505d6a431b]
2023-12-23 19:19:57 backtrace #12: ./DarkMark: dm::DMContent::set_sort_order(dm::ESort) +0x171 [0x56505d6a4fb1]
2023-12-23 19:19:57 backtrace #13: ./DarkMark: dm::DMContent::start_darknet() +0x525 [0x56505d6a57c5]
2023-12-23 19:19:57 backtrace #14: ./DarkMark: dm::DMWnd::DMWnd(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&) +0x2de [0x56505d6dad5e]
2023-12-23 19:19:57 backtrace #15: ./DarkMark: dm::StartupWnd::buttonClicked(juce::Button) +0x1e03 [0x56505d6f9a33]
2023-12-23 19:19:57 backtrace #16: ./DarkMark: juce::Button::sendClickMessage(juce::ModifierKeys const&) +0x228 [0x56505d979fd8]
2023-12-23 19:19:57 backtrace #17: ./DarkMark: juce::Button::mouseUp(juce::MouseEvent const&) +0xe9 [0x56505d975439]
2023-12-23 19:19:57 backtrace #18: ./DarkMark: juce::Component::internalMouseUp(juce::MouseInputSource, juce::PointerState const&, juce::Time, juce::ModifierKeys) +0x1b7 [0x56505d975837]
2023-12-23 19:19:57 backtrace #19: ./DarkMark: juce::MouseInputSourceInternal::setButtons(juce::PointerState const&, juce::Time, juce::ModifierKeys) +0x119 [0x56505da01a79]
2023-12-23 19:19:57 backtrace #20: ./DarkMark: juce::MouseInputSource::handleEvent(juce::ComponentPeer&, juce::Point, long long, juce::ModifierKeys, float, float, juce::PenDetails const&) +0x2f0 [0x56505d978c40]
2023-12-23 19:19:57 backtrace #21: ./DarkMark: juce::XWindowSystem::handleButtonReleaseEvent(juce::LinuxComponentPeer, XButtonEvent const&) const +0x1bb [0x56505d9a380b]
2023-12-23 19:19:57 backtrace #22: ./DarkMark: juce::XWindowSystem::handleWindowMessage(juce::LinuxComponentPeer*, _XEvent&) const +0x2ad [0x56505d9a570d]
2023-12-23 19:19:57 backtrace #23: ./DarkMark(+0x59bbab) [0x56505d9a5bab]
2023-12-23 19:19:57 backtrace #24: ./DarkMark: juce::MessageManager::runDispatchLoop() +0x1a1 [0x56505d7e48c1]
2023-12-23 19:19:57 backtrace #25: ./DarkMark: juce::JUCEApplicationBase::main() +0x41 [0x56505d614461]
2023-12-23 19:19:57 backtrace #26: /lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f353ae98d90]
2023-12-23 19:19:57 backtrace #27: /lib/x86_64-linux-gnu/libc.so.6: __libc_start_main +0x80 [0x7f353ae98e40]
2023-12-23 19:19:57 backtrace #28: ./DarkMark: _start +0x25 [0x56505d615fb5]
Aborted (core dumped)

Note that this happens only when I press the Load button after creating the Darknet files and training the model.
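
A backtrace like the one in this comment is easier to read once it is split into frames. The following is a minimal sketch, assuming each frame looks like `backtrace #N: module: symbol +0xOFFSET [0xADDRESS]` or `backtrace #N: module(+0xOFFSET) [0xADDRESS]`, as in the log above. The names `Frame`, `parse_frames`, and `innermost_frame_in` are hypothetical helpers written for this illustration and are not part of DarkMark or Darknet.

```python
import re
from typing import List, NamedTuple, Optional


class Frame(NamedTuple):
    index: int
    module: str
    symbol: Optional[str]  # None when the log gives only a module offset


# Matches "backtrace #N: <body> [0x<address>]"; the body is split afterwards.
FRAME_RE = re.compile(r"backtrace #(\d+): (.+?) \[0x[0-9a-f]+\]")


def parse_frames(log_text: str) -> List[Frame]:
    """Extract (index, module, symbol) for every backtrace entry in the log."""
    frames = []
    for match in FRAME_RE.finditer(log_text):
        index, body = int(match.group(1)), match.group(2)
        if ": " in body:
            # "module: symbol +0xOFFSET"
            module, rest = body.split(": ", 1)
            symbol = rest.rsplit(" +0x", 1)[0]
        else:
            # "module(+0xOFFSET)" with no symbol name
            module, symbol = body.split("(", 1)[0], None
        frames.append(Frame(index, module, symbol))
    return frames


def innermost_frame_in(frames: List[Frame], module_substring: str) -> Optional[Frame]:
    """Return the first (innermost) named frame whose module matches."""
    for frame in frames:
        if module_substring in frame.module and frame.symbol:
            return frame
    return None


# A few entries copied from the log above.
SAMPLE = """\
2023-12-23 19:19:57 backtrace #2: /lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7f353aeb1520]
2023-12-23 19:19:57 backtrace #3: /lib/x86_64-linux-gnu/libc.so.6(+0x1af84e) [0x7f353b01e84e]
2023-12-23 19:19:57 backtrace #4: /lib/libdarknet.so: im2col_cpu_ext +0x65e [0x7f353c2d57fe]
2023-12-23 19:19:57 backtrace #5: /lib/libdarknet.so: forward_convolutional_layer +0x165 [0x7f353c258b15]
2023-12-23 19:19:57 backtrace #8: ./DarkMark: DarkHelp::NN::predict_internal_darknet() +0x23f [0x56505da53acf]
"""

frames = parse_frames(SAMPLE)
crash_site = innermost_frame_in(frames, "libdarknet")
print(crash_site)
```

Read this way, the innermost named Darknet frame is `im2col_cpu_ext`, reached through `network_predict` from `DarkHelp::NN::predict_internal_darknet()`, which points at the prediction call made while loading the first image rather than at the network load itself.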

stephanecharette commented 9 months ago

I and the other people on the Darknet/YOLO discord have zero crashes or problems with DarkMark. It is working well for us.

If you'd like me to dig into the problem you are seeing, then we'd need more details, or some files we can load to replicate the problem you are seeing. Without being able to replicate the problem, we're 100% reliant on your description to reproduce the issue.

Here is a screenshot I took just now of DarkMark v1.8.18-1 -- the latest version -- running on my rig where I'm training a new network today. So I'm 100% certain that it does work.

Just out of curiosity, which version of Darknet are you using? You should be using the latest from this repo: https://github.com/hank-ai/darknet#table-of-contents

hassoun123 commented 9 months ago

The Darknet version is v2.0-63-g315c57b6-dirty. This was the one installed by default from the GitHub repository. The available versions are v1.99 and v2.0.

stephanecharette commented 9 months ago

The "dirty" in your version string shows that you made local modifications to darknet. What changes did you make? Run "git status" and/or "git diff" to see what you modified.

hassoun123 commented 9 months ago

I didn't modify anything, but I'm using Linux Mint. I don't know if that's an issue.

stephanecharette commented 9 months ago

Git appends the word "dirty" if a repo has changes in it. It has nothing to do with Mint. Run git status and/or git diff to see the changes you have.
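
For readers unfamiliar with this kind of version string, here is a minimal sketch of how it breaks down. It assumes the string follows the usual `git describe --dirty` layout (tag, number of commits since the tag, `g` plus the abbreviated commit hash, and an optional `-dirty` suffix); whether Darknet's build generates it exactly this way is an assumption. `Describe` and `parse_describe` are hypothetical names used only for this illustration.

```python
import re
from typing import NamedTuple, Optional


class Describe(NamedTuple):
    tag: str
    commits_since_tag: int
    short_hash: str
    dirty: bool


# <tag>-<count>-g<hash>[-dirty], the layout assumed in the paragraph above.
DESCRIBE_RE = re.compile(
    r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<hash>[0-9a-f]+)(?P<dirty>-dirty)?$"
)


def parse_describe(version: str) -> Optional[Describe]:
    """Split a describe-style version string, or return None if it does not match."""
    match = DESCRIBE_RE.match(version)
    if match is None:
        return None
    return Describe(
        tag=match["tag"],
        commits_since_tag=int(match["count"]),
        short_hash=match["hash"],
        dirty=match["dirty"] is not None,
    )


info = parse_describe("v2.0-63-g315c57b6-dirty")
print(info)
```

Applied to the string from this thread, the sketch reports tag `v2.0`, 63 commits after that tag, commit `315c57b6`, and a working tree with local changes.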

stephanecharette commented 9 months ago

I believe I managed to reproduce the problem. Fix is in progress.

stephanecharette commented 9 months ago

Please see if the latest version of Darknet combined with the latest version of DarkHelp has solved the issue.

hassoun123 commented 9 months ago

It worked, thank you so much for your help. Can you tell me what the problem was?

stephanecharette commented 9 months ago

The problem was a change recently made to Darknet. When using Darknet as a library instead of a CLI tool, the GPU index number remained uninitialized at -1, which then prevented the memory allocation needed to transfer image data between the CPU and the GPU. This is what led to the segfault.

In my case, I deal mostly with virtual machines, which don't have GPUs. So I've been using the CPU version of Darknet instead of the GPU version, and thus I wasn't running into this problem.
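
The failure mode described above can be illustrated with a small sketch. This is not Darknet's code: `SketchNetwork`, `transfer_buffer`, and the 4-byte buffer size are all hypothetical, and Python raises an exception where the C library would dereference an unallocated pointer and segfault. The point is only the pattern: a sentinel index of -1 silently skips an allocation that a later call depends on.

```python
from typing import Optional


class SketchNetwork:
    """Hypothetical stand-in for a network object; not Darknet's real structure."""

    def __init__(self, gpu_index: int = -1) -> None:
        # -1 is the "not initialized" sentinel described in the comment above.
        self.gpu_index = gpu_index
        # The transfer buffer is only allocated for a valid device index.
        self.transfer_buffer: Optional[bytearray] = (
            bytearray(4) if gpu_index >= 0 else None
        )

    def predict(self, pixels: bytes) -> int:
        """Copy pixels into the transfer buffer and return the number of bytes copied."""
        if self.transfer_buffer is None:
            # In C, skipping this check means writing through a null pointer.
            raise RuntimeError("transfer buffer was never allocated (gpu_index is -1)")
        count = min(len(pixels), len(self.transfer_buffer))
        self.transfer_buffer[:count] = pixels[:count]
        return count


# Index set by the caller: the buffer exists and the copy succeeds.
copied = SketchNetwork(gpu_index=0).predict(b"\x01\x02")

# Index left at the sentinel: the missing allocation surfaces at predict time.
try:
    SketchNetwork().predict(b"\x01\x02")
    failed = False
except RuntimeError:
    failed = True
```

This also matches the symptom in the thread: loading the network succeeds, and the crash only appears on the first prediction.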

stephanecharette / DarkMark

DarkMark crashes #30