Closed hassoun123 closed 9 months ago
This is not enough detail for me to determine what might have gone wrong. At the very least, you need to look at the log file and see where the crash happened, and the call stack that gets logged.
The Valgrind output points to memory allocation issues within DarkMark, specifically in functions from the OpenCV library (libopencv_core.so) and in the DarkHelp::NN class. The "possibly lost" bytes suggest allocations that were never freed, which can indicate memory leaks. The log also shows that DarkMark is hitting a segmentation fault.
2023-12-23 19:19:57 finding all images and markup files in /home/hassan/YOLOv4/Paddle
2023-12-23 19:19:57 number of images found in /home/hassan/YOLOv4/Paddle: 3955
2023-12-23 19:19:57 loading darknet neural network
2023-12-23 19:19:57 attempting to load neural network /home/hassan/YOLOv4/Paddle/Paddle.cfg / /home/hassan/YOLOv4/Paddle/Paddle_best.weights / /home/hassan/YOLOv4/Paddle/Paddle.names
2023-12-23 19:19:57 neural network loaded in 166.778 milliseconds
2023-12-23 19:19:57 number of name entries: 1
2023-12-23 19:19:57 aborting due to signal: "Segmentation fault" [signal #11]
2023-12-23 19:19:57 backtrace #0: ./DarkMark: get_backtrace[abi:cxx11]() +0x4f [0x56505d617ccf]
2023-12-23 19:19:57 backtrace #1: ./DarkMark: dm::DarkMarkApplication::signal_handler(int) +0x1ca [0x56505d61908a]
2023-12-23 19:19:57 backtrace #2: /lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7f353aeb1520]
2023-12-23 19:19:57 backtrace #3: /lib/x86_64-linux-gnu/libc.so.6(+0x1af84e) [0x7f353b01e84e]
2023-12-23 19:19:57 backtrace #4: /lib/libdarknet.so: im2col_cpu_ext +0x65e [0x7f353c2d57fe]
2023-12-23 19:19:57 backtrace #5: /lib/libdarknet.so: forward_convolutional_layer +0x165 [0x7f353c258b15]
2023-12-23 19:19:57 backtrace #6: /lib/libdarknet.so: forward_network +0x87 [0x7f353c2fa8f7]
2023-12-23 19:19:57 backtrace #7: /lib/libdarknet.so: network_predict +0x87 [0x7f353c2fc127]
2023-12-23 19:19:57 backtrace #8: ./DarkMark: DarkHelp::NN::predict_internal_darknet() +0x23f [0x56505da53acf]
2023-12-23 19:19:57 backtrace #9: ./DarkMark: DarkHelp::NN::predict_internal(cv::Mat, float) +0x2e2 [0x56505da590a2]
2023-12-23 19:19:57 backtrace #10: ./DarkMark: DarkHelp::NN::predict(cv::Mat, float) +0xa7 [0x56505da5b4d7]
2023-12-23 19:19:57 backtrace #11: ./DarkMark: dm::DMContent::load_image(unsigned long, bool, bool) +0x76b [0x56505d6a431b]
2023-12-23 19:19:57 backtrace #12: ./DarkMark: dm::DMContent::set_sort_order(dm::ESort) +0x171 [0x56505d6a4fb1]
2023-12-23 19:19:57 backtrace #13: ./DarkMark: dm::DMContent::start_darknet() +0x525 [0x56505d6a57c5]
2023-12-23 19:19:57 backtrace #14: ./DarkMark: dm::DMWnd::DMWnd(std::__cxx11::basic_string<char, std::char_traits
Note that this happens only when I press the load button after creating the Darknet files and training the model.
I and the other people on the Darknet/YOLO discord have zero crashes or problems with DarkMark. It is working well for us.
If you'd like me to dig into the problem you are seeing, then we'd need more details, or some files we can load to replicate the problem you are seeing. Without being able to replicate the problem, we're 100% reliant on your description to reproduce the issue.
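Here is a screenshot I took just now of DarkMark v1.8.18-1 -- the latest version -- running on my rig where I'm training a new network today. So I'm 100% certain that it does work. Screenshot: https://github.com/stephanecharette/DarkMark/assets/5061352/6e079a3c-95eb-4266-be13-ed41566dd167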
Just out of curiosity, which version of Darknet are you using? You should be using the latest from this repo: https://github.com/hank-ai/darknet#table-of-contents
The Darknet version is v2.0-63-g315c57b6-dirty. This was the one installed by default from the GitHub repository. The available versions are v1.99 and v2.0.
The "dirty" in your version string shows that you made local modifications to darknet. What changes did you make? Run "git status" and/or "git diff" to see what you modified.
I didn't modify anything, but I'm using Linux Mint. I don't know if that's an issue.
Git appends the word "dirty" if a repo has changes in it. It has nothing to do with Mint. Run "git status" and/or "git diff" to see the changes you have.
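To make the version string easier to read: v2.0-63-g315c57b6-dirty appears to follow the layout that "git describe --dirty" produces, namely the nearest tag, the number of commits since that tag, "g" plus the abbreviated commit hash, and an optional "-dirty" suffix for local modifications. The sketch below splits a string of that shape into its parts. The struct and function names are made up for this illustration and are not part of Darknet or DarkMark.

```cpp
#include <cassert>
#include <string>

// Hypothetical holder for the parts of a "<tag>-<commits>-g<hash>[-dirty]" string.
struct DescribedVersion {
    std::string tag;            // nearest tag, e.g. "v2.0"
    int commits_since_tag = 0;  // commits made after that tag
    std::string short_hash;     // abbreviated commit hash without the leading 'g'
    bool dirty = false;         // true when the string ends in "-dirty"
};

// Returns true and fills `out` when `text` has the expected shape; false otherwise.
bool parse_described_version(std::string text, DescribedVersion& out) {
    DescribedVersion v;
    const std::string suffix = "-dirty";

    // Strip the optional "-dirty" suffix first.
    if (text.size() >= suffix.size() &&
        text.compare(text.size() - suffix.size(), suffix.size(), suffix) == 0) {
        v.dirty = true;
        text.erase(text.size() - suffix.size());
    }

    // The hash is whatever follows the last "-g".
    const std::string::size_type hash_pos = text.rfind("-g");
    if (hash_pos == std::string::npos) return false;
    v.short_hash = text.substr(hash_pos + 2);
    text.erase(hash_pos);

    // The commit count is the all-digit field after the last remaining '-'.
    const std::string::size_type count_pos = text.rfind('-');
    if (count_pos == std::string::npos) return false;
    const std::string count = text.substr(count_pos + 1);
    if (count.empty() || count.size() > 9 ||
        count.find_first_not_of("0123456789") != std::string::npos) {
        return false;
    }
    v.commits_since_tag = std::stoi(count);

    // Everything before the count is the tag.
    v.tag = text.substr(0, count_pos);
    if (v.tag.empty() || v.short_hash.empty()) return false;

    out = v;
    return true;
}
```

For the string reported in this issue, the sketch yields tag v2.0, 63 commits since the tag, hash 315c57b6, and dirty set to true.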
I believe I managed to reproduce the problem. Fix is in progress.
Please see if the latest version of Darknet combined with the latest version of DarkHelp has solved the issue.
It worked, thank you so much for your help. Can you tell me what the problem was?
The problem was a change recently made to Darknet. When using Darknet as a library instead of a CLI tool, the GPU index number remained uninitialized at -1, which then prevented the memory allocation needed to transfer image data between the CPU and the GPU. This is what led to the segfault.
In my case, I deal mostly with virtual machines, which don't have GPUs. So I've been using the CPU version of Darknet instead of the GPU version, and thus I wasn't running into this problem.
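The failure mode described above (a GPU index left at -1 on the library path, so the transfer buffer never gets allocated) can be illustrated with a small, self-contained sketch. This is not Darknet's actual code: g_gpu_index, Network, init_from_cli, load_network and predict are hypothetical names, and a plain std::vector stands in for the CPU-to-GPU transfer buffer. It only shows why a skipped allocation surfaces later as a crash inside the copy, and how a size check turns that into a reportable error.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical library-wide setting; -1 means "no GPU has been selected yet".
static int g_gpu_index = -1;

// Hypothetical network handle with a staging buffer for image data.
struct Network {
    std::vector<float> staging;
};

// CLI-style start-up path: this is the only place the index gets set in this sketch.
void init_from_cli(int requested_gpu) {
    g_gpu_index = requested_gpu;
}

// Library-style entry point: the staging buffer is only allocated when an
// index has been selected, so a caller that skips init_from_cli gets none.
Network load_network(std::size_t input_size) {
    Network net;
    if (g_gpu_index >= 0) {
        net.staging.resize(input_size);
    }
    return net;
}

// Copies the image into the staging buffer. Without the size check, the copy
// would write through a buffer that was never allocated, which is the kind of
// access that ends in a segmentation fault.
bool predict(Network& net, const std::vector<float>& image) {
    if (net.staging.size() < image.size()) {
        return false;  // report the missing buffer instead of crashing
    }
    std::copy(image.begin(), image.end(), net.staging.begin());
    return true;
}
```

In this sketch, calling load_network without init_from_cli leaves the buffer empty and predict returns false; calling init_from_cli(0) first makes the same call succeed.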
Although the training process has finished, and the weights and cfg files are all good with correct paths, DarkMark crashes whenever I try to load after having the best weights and cfg files.