hpc / mpifileutils

File utilities designed for scalability and performance.
https://hpc.github.io/mpifileutils
BSD 3-Clause "New" or "Revised" License

dfilemaker exiting with segmentation violation #603

Closed: carbonneau1 closed this issue 3 hours ago

carbonneau1 commented 6 days ago

Using dfilemaker to create large files in a deep directory tree results in segmentation violations.

This was run with the following parameters: dfilemaker ntotal 10000 nlevels20 maxflen 27487790

[2024-11-20T11:04:14] Creating 3373 directories
[2024-11-20T11:04:15] Creating 3278 files
[2024-11-20T11:04:16] Writing content to files.
[mutt6:1408998:0:1408998] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x1aeb000)
[mutt6:1408998] Process received signal
[mutt6:1408998] Signal: Bus error (7)
[mutt6:1408998] Signal code: (128)
[mutt6:1408998] Failing at address: (nil)

A trace from another run:

[2024-11-20T10:36:36] Writing content to files.
Failed to open file: destpath=?j?%??m ??rGǝ?<?`?Rmf?-??F?'??0Z??= )??h?eU??bv??7;????%d?Q|?[??@HTPP?o???hb??K??mi??R?_?gUO?c???*!?? s?- errno=2 (No such file or directory)
[mutt6:322022:0:322022] Caught signal 11 (Segmentation fault: Sent by the kernel at address (nil))
Segmentation fault (core dumped)

This issue needs to be fixed before we can put dfilemaker into production.

ofaaland commented 6 days ago

@carbonneau1 What arguments and node/process counts triggered this?

ofaaland commented 20 hours ago

@carbonneau1 I'm unable to reproduce this with the current dfilemaker code, which has changed a lot with Levatin's commits and mine. Please try again and let me know if you can still reproduce the issue.

carbonneau1 commented 3 hours ago

I no longer get a segmentation violation. Instead, I get an error about being unable to allocate memory, which is expected. Closing the issue.