datalad / datalad-fuse

DataLad extension to provide FUSE file system access

Doesn't provide access to files in openneuro subdataset #118

Open yarikoptic opened 3 months ago

yarikoptic commented 3 months ago
$> datalad fusefs -d $PWD --mode-transparent --foreground /tmp/openneuro-fuse
Exception ignored on calling ctypes callback function: functools.partial(<function FUSE._wrapper at 0x7f689a88eca0>, <bound method FUSE.open of <fuse.FUSE object at 0x7f689a888f90>>)
Traceback (most recent call last):
  File "/home/yoh/venvs/dev3/lib/python3.11/site-packages/fuse.py", line 756, in _wrapper
    self.__critical_exception = e
    ^^^^
NameError: name 'self' is not defined
fuse: bad error value: 1657291314
Exception ignored on calling ctypes callback function: functools.partial(<function FUSE._wrapper at 0x7f689a88eca0>, <bound method FUSE.open of <fuse.FUSE object at 0x7f689a888f90>>)
Traceback (most recent call last):
  File "/home/yoh/venvs/dev3/lib/python3.11/site-packages/fuse.py", line 756, in _wrapper
    self.__critical_exception = e
    ^^^^
NameError: name 'self' is not defined
fuse: bad error value: 1657291314
Exception ignored on calling ctypes callback function: functools.partial(<function FUSE._wrapper at 0x7f689a88eca0>, <bound method FUSE.open of <fuse.FUSE object at 0x7f689a888f90>>)
Traceback (most recent call last):
  File "/home/yoh/venvs/dev3/lib/python3.11/site-packages/fuse.py", line 756, in _wrapper
    self.__critical_exception = e
    ^^^^
NameError: name 'self' is not defined

whenever I just try to nib-ls that file:

(git)smaug:/tmp/openneuro-fuse/ds000001[master]git
$> nib-ls sub-02/anat/sub-02_inplaneT2.nii.gz
sub-02/anat/sub-02_inplaneT2.nii.gz failed

fsspec 2024.5.0

$> apt policy libfuse3-3
libfuse3-3:
  Installed: 3.14.0-4
  Candidate: 3.14.0-4
  Version table:
 *** 3.14.0-4 100
        100 http://debian.osuosl.org/debian bookworm/main amd64 Packages
        100 /var/lib/dpkg/status
nothing immediately clear from the debug output:

```
[DEBUG ] Starting new runner for BatchedAnnex(command=['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'annex', 'examinekey', '--format=annex/objects/${hashdirmixed}${key}/${key}\\n', '--batch', '--debug'], encoding=None, exception_on_timeout=False, last_request=None, output_proc=None, path=/mnt/btrfs/datasets/datalad/crawl/openneuro/ds000001, return_code=None, runner=None, stderr_output=b'', timeout=None, wait_timed_out=None)
[Level 5] Command: ['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'annex', 'examinekey', '--format=annex/objects/${hashdirmixed}${key}/${key}\\n', '--batch', '--debug']
[DEBUG ] Run ['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'annex', 'examinekey', '--format=annex/objects/${hashdirmixed}${key}/${key}\\n', '--batch', '--debug'] (protocol_class=BatchedCommandProtocol) (cwd=/mnt/btrfs/datasets/datalad/crawl/openneuro/ds000001)
[Level 8] Process 165991 started
[Level 5] STDERR: git -c diff.ignoreSu (Thread<(STDERR: git -c diff.ignoreSu, 11)>) started
[Level 5] STDOUT: git -c diff.ignoreSu (Thread<(STDOUT: git -c diff.ignoreSu, 9)>) started
[Level 5] STDIN: git -c diff.ignoreSu (Thread<(STDIN: git -c diff.ignoreSu, 8)>) started
[Level 5] process_waiter (Thread) started
[DEBUG ] Starting new runner for BatchedAnnex(command=['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'annex', 'examinekey', '--format=annex/objects/${hashdirlower}${key}/${key}\\n', '--batch', '--debug'], encoding=None, exception_on_timeout=False, last_request=None, output_proc=None, path=/mnt/btrfs/datasets/datalad/crawl/openneuro/ds000001, return_code=None, runner=None, stderr_output=b'', timeout=None, wait_timed_out=None)
[Level 5] Command: ['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'annex', 'examinekey', '--format=annex/objects/${hashdirlower}${key}/${key}\\n', '--batch', '--debug']
[DEBUG ] Run ['git', '-c', 'diff.ignoreSubmodules=none', '-c', 'core.quotepath=false', 'annex', 'examinekey', '--format=annex/objects/${hashdirlower}${key}/${key}\\n', '--batch', '--debug'] (protocol_class=BatchedCommandProtocol) (cwd=/mnt/btrfs/datasets/datalad/crawl/openneuro/ds000001)
[Level 8] Process 166016 started
[Level 5] STDERR: git -c diff.ignoreSu (Thread<(STDERR: git -c diff.ignoreSu, 14)>) started
[Level 5] STDOUT: git -c diff.ignoreSu (Thread<(STDOUT: git -c diff.ignoreSu, 12)>) started
[Level 5] STDIN: git -c diff.ignoreSu (Thread<(STDIN: git -c diff.ignoreSu, 10)>) started
[Level 5] process_waiter (Thread) started
Exception ignored on calling ctypes callback function: functools.partial(, >)
Traceback (most recent call last):
  File "/home/yoh/venvs/dev3/lib/python3.11/site-packages/fuse.py", line 756, in _wrapper
    self.__critical_exception = e
    ^^^^
NameError: name 'self' is not defined
fuse: bad error value: 1657291314
[DEBUG ] op=readlink for path=/sub-02/anat/sub-02_inplaneT2.nii.gz with args ()
[DEBUG ] readlink(path='/mnt/btrfs/datasets/datalad/crawl/openneuro/ds000001/sub-02/anat/sub-02_inplaneT2.nii.gz')
[DEBUG ] op=readlink for path=/sub-02/anat/sub-02_inplaneT2.nii.gz with args ()
[DEBUG ] readlink(path='/mnt/btrfs/datasets/datalad/crawl/openneuro/ds000001/sub-02/anat/sub-02_inplaneT2.nii.gz')
[DEBUG ] op=access for path=/.git/annex/objects/49/7G/MD5E-s718835--da6846d3f0a6c0fe29bb1bf22c796354.nii.gz/MD5E-s718835--da6846d3f0a6c0fe29bb1bf22c796354.nii.gz with args (2,)
[DEBUG ] op=readlink for path=/sub-02/anat/sub-02_inplaneT2.nii.gz with args ()
[DEBUG ] readlink(path='/mnt/btrfs/datasets/datalad/crawl/openneuro/ds000001/sub-02/anat/sub-02_inplaneT2.nii.gz')
[DEBUG ] op=access for path=/.git/annex/objects/49/7G/MD5E-s718835--da6846d3f0a6c0fe29bb1bf22c796354.nii.gz/MD5E-s718835--da6846d3f0a6c0fe29bb1bf22c796354.nii.gz with args (1,)
```

yarikoptic commented 3 months ago

yikes -- we do not have s3-PUBLIC enabled

(git)smaug:/mnt/btrfs/datasets/datalad/crawl/openneuro/ds000001[master]git
$> git annex whereis sub-02/anat/sub-02_inplaneT2.nii.gz
whereis sub-02/anat/sub-02_inplaneT2.nii.gz (2 copies)
        8d2b6e96-ad81-44a5-99b4-0ec37d6b3800 -- s3-PUBLIC
        b5dd2e3d-825f-4bc2-b719-cba1059f6bfc -- root@93184394ac19:/datalad/ds000001
ok

and thus no urls.

We need to log an ERROR or WARNING whenever no URLs are available!
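
A rough sketch of what such a warning could look like, in Python. Everything here is hypothetical: the function name `first_url_or_warn`, the logger name, and the idea that the caller already has a list of URLs for the file are assumptions for illustration, not the actual datalad-fuse code.

```python
import logging

# Hypothetical logger name, chosen only for this sketch.
lgr = logging.getLogger("datalad.fuse.sketch")


def first_url_or_warn(path, urls):
    """Return the first URL for `path`, or None after logging a warning.

    `urls` is assumed to be the list of URLs known for the file,
    however the caller obtained it.
    """
    if not urls:
        # Make the "no URLs" case loud instead of failing silently.
        lgr.warning(
            "No URLs available for %s; is a remote such as s3-PUBLIC enabled?",
            path,
        )
        return None
    return urls[0]
```

With something along these lines, the failed `nib-ls` above would at least leave a visible hint in the log that the file has no usable URL.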

yarikoptic commented 3 months ago

the reason was reported at https://git-annex.branchable.com/bugs/multiple_records_in_remote.log_for_the_same_remote/?updated#comment-ac98becd2436d7cd398fcb1ad7c7147f and is now mitigated locally

(git)smaug:/mnt/btrfs/datasets/datalad/crawl/openneuro[master]git
$> for ds in ds*; do nr=$(git -C $ds show git-annex:remote.log | grep 's3-PUBLIC ' | wc -l); if [ "$nr" = "1" ]; then continue; fi ; echo $ds; git -C $ds remote remove s3-PUBLIC; git -C $ds annex enableremote s3-PUBLIC;  done
ds000006
enableremote s3-PUBLIC ok
(recording state in git...)
ds000007
enableremote s3-PUBLIC ok
(recording state in git...)
ds000008
enableremote s3-PUBLIC ok
(recording state in git...)
ds000009
enableremote s3-PUBLIC ok
(recording state in git...)
ds000011
enableremote s3-PUBLIC ok
(recording state in git...)
ds000017
enableremote s3-PUBLIC ok
(recording state in git...)
ds000030
error: No such remote: 's3-PUBLIC'
enableremote s3-PUBLIC ok
(recording state in git...)
ds000031
...

so apparently we did not even have s3-PUBLIC enabled in some of those at all...
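
The per-dataset check from the loop above could also be sketched in Python. This is only an illustration: the helper names are made up, and it assumes each `remote.log` line is a UUID followed by space-separated `key=value` fields that include `name=<remote>`.

```python
import subprocess


def read_remote_log(dataset_path):
    """Return the text of remote.log from the git-annex branch of a dataset."""
    result = subprocess.run(
        ["git", "-C", dataset_path, "show", "git-annex:remote.log"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def count_remote_records(remote_log_text, name):
    """Count remote.log lines that carry a `name=<name>` field.

    Assumes whitespace-separated key=value fields on each line.
    """
    token = f"name={name}"
    return sum(
        1 for line in remote_log_text.splitlines() if token in line.split()
    )


def needs_fix(remote_log_text, name="s3-PUBLIC"):
    """True when the remote does not have exactly one record."""
    return count_remote_records(remote_log_text, name) != 1
```

A dataset would then be flagged with `needs_fix(read_remote_log("ds000006"))`. Unlike the `grep 's3-PUBLIC '` match, this only counts the name when it appears as a complete `name=` field.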