Closed nrsc closed 2 months ago
Thanks for the report, @nrsc. Could you post a list of the files (with full paths) you are attempting to organize (e.g. using the tree command)?
Here's the output of all the files in the folder. dandi_output.txt
Hi @nrsc. Since you are a bug magnet (like yours truly), you might want to learn about an option to drop into the Python debugger, which could be of help for troubleshooting in the future:
❯ dandi --help | grep pdb
--pdb Fall into pdb if errors out
which, if you specify it (e.g. dandi --pdb organize ...), would drop you into the pdb debugger at the point of the error on numpy.bytes_, which I fail to reproduce ATM. More on how to use pdb can be found e.g. at https://realpython.com/python-debugging-pdb/ .
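For anyone new to pdb, here is a minimal sketch of the general "drop into the debugger when something errors out" pattern in plain Python. It is not dandi's actual implementation of --pdb; the helper name run_with_postmortem and its debugger parameter are made up for illustration.

```python
import pdb
import sys
import traceback


def run_with_postmortem(func, *args, debugger=pdb.post_mortem, **kwargs):
    """Call func; on an unhandled exception, open a debugger on its traceback.

    Hypothetical helper for illustration only, not part of dandi.
    """
    try:
        return func(*args, **kwargs)
    except Exception:
        traceback.print_exc()        # show the error first
        debugger(sys.exc_info()[2])  # inspect the frame that raised
        raise                        # then let the error propagate as usual
```

Once at the (Pdb) prompt, `w` prints the stack, `u` and `d` move between frames, `p some_variable` prints a value, and `q` quits.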
I will now look into providing more information about those non-unique paths - we must be able to provide a more informative message there!
@nrsc try out
Error: 'numpy.bytes_' object has no attribute 'encode'
Thank you @yarikoptic. Will confirm effects of updates once I get the chance to sit back down with this again. Cheers all.
Hi @yarikoptic. Here is the output from pdb
I'm wondering whether the line
2024-08-26T11:00:57-0700 [DEBUG ] dandi 784710:134806063919104 Caught exception Only 'dry' or 'move' mode could be used to operate in-place within a dandiset (no paths were provided)
can point to the origin of the error.
Unfortunately I am not so well versed in Python and Python debugging. I've been an R guy for a while now, but I am interested in contributing as best I can and learning about this process.
Should I pull the organize.py file that you pushed last week and try running the organize function again?
Cheers,
Scott
Hi @yarikoptic. Updated from the repository, and I now get the paths out when the error identifies the duplicated paths. That helped me identify and fix the issue. Thank you for providing the patch.
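For readers hitting the same "paths are not unique" message, the idea of listing which source files collide on one target path can be sketched roughly as below. This is only an illustration under assumptions: the function name, the source-to-target dict shape, and the example paths are all hypothetical, and this is not the code from the patch.

```python
from collections import defaultdict


def find_colliding_targets(planned):
    """Map each target path claimed by more than one source to its sources.

    planned: dict of source path -> target path (hypothetical input shape).
    """
    by_target = defaultdict(list)
    for source, target in planned.items():
        by_target[target].append(source)
    # Keep only targets that more than one source file maps to.
    return {t: sorted(s) for t, s in by_target.items() if len(s) > 1}


# Made-up example paths, for illustration only.
planned = {
    "raw/a.nwb": "sub-01/sub-01_ses-1.nwb",
    "raw/b.nwb": "sub-01/sub-01_ses-1.nwb",
    "raw/c.nwb": "sub-02/sub-02_ses-1.nwb",
}
for target, sources in find_colliding_targets(planned).items():
    print(target, "<-", ", ".join(sources))
```

Seeing the colliding sources side by side is usually enough to spot which metadata fields are identical between the files.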
Sorry I missed your prior comment, and thanks for reporting back - it brings us joy to have issues closed! ;)
Hello all,
Running into an issue while trying to organize my dandiset using the CLI. Getting
[ INFO] 1 out of 296 paths are not unique. We will try adding _obj- based on crc32 of object_id
and I can't seem to locate any more information about the non-unique path in question. I've been checking the logs, but there are no further details on which path is not unique. Build info as follows.
I am using the dandi v0.63.0+5.g37b63509 build to address a metadata problem I was facing previously, but this issue comes up whether or not I am using the +5.g37...09 build. I have added more files to the dataset, so I assume that somewhere along the way I added a file that trips the "not unique" issue. Unfortunately I do not know where to look to identify where the issue stems from, as the "non-unique path" is not written into the log.
Another error that shows up:
Error: 'numpy.bytes_' object has no attribute 'encode'
I will attach the log as well. 2024.08.19-16.50.59Z-647577.log
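As background on that second error: numpy.bytes_ is a subclass of Python's built-in bytes, and bytes objects have a decode() method but no encode(), so calling .encode() on one raises exactly this kind of AttributeError. Where this happens inside dandi is not established in this thread. The sketch below uses plain bytes as a stand-in, and the helper to_text is hypothetical, just one way to normalize such values to str.

```python
def to_text(value, encoding="utf-8"):
    """Return value as str, decoding it first if it is bytes-like.

    Hypothetical helper for illustration only, not part of dandi.
    """
    if isinstance(value, (bytes, bytearray)):
        return value.decode(encoding)
    return str(value)


raw = b"subject-01"  # stands in for a numpy.bytes_ metadata value
assert not hasattr(raw, "encode")  # bytes has decode() but no encode()
print(to_text(raw).encode("utf-8"))  # safe: encode() is called on a str
```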
Other details regarding build information