subodhdere opened this issue 1 year ago
Hi @subodhdere, please post the output of dvc doctor.
Also, the log you've posted seems to be partial. Please post the full log.
Hello, please see below the output of dvc doctor.
singhab@jupyter-singhab-jupyter:~$ dvc doctor
DVC version: 2.58.2 (conda)
---------------------------
Platform: Python 3.8.16 on Linux-3.10.0-957.1.3.el7.x86_64-x86_64-with-glibc2.10
Subprojects:
dvc_data = 0.51.0
dvc_objects = 0.22.0
dvc_render = 0.5.3
dvc_task = 0.2.1
scmrepo = 1.0.3
Supports:
http (aiohttp = 3.8.4, aiohttp-retry = 2.8.3),
https (aiohttp = 3.8.4, aiohttp-retry = 2.8.3)
Config:
Global: /home/singhab/.config/dvc
System: /etc/xdg/dvc
Adding the full logs.
singhab@jupyter-singhab-jupyter:~/dvc-example$ dvc push -r myremote -v
2023-06-29 10:42:46,833 DEBUG: v2.58.2 (conda), CPython 3.8.16 on Linux-3.10.0-957.1.3.el7.x86_64-x86_64-with-glibc2.10
2023-06-29 10:42:46,833 DEBUG: command: /opt/conda/bin/dvc push -r myremote -v
2023-06-29 10:42:47,284 DEBUG: Preparing to transfer data from '/home/singhab/dvc-example/.dvc/cache' to '/user/halo/dvc-usecase'
2023-06-29 10:42:47,284 DEBUG: Preparing to collect status from '/user/halo/dvc-usecase'
2023-06-29 10:42:47,284 DEBUG: Collecting status from '/user/halo/dvc-usecase'
2023-06-29 10:42:47,285 DEBUG: Querying 1 oids via object_exists
0% Checking cache in '/user/halo/dvc-usecase'| |0/? [00:00<?, ?files/s]loadFileSystems error:
(unable to get root cause for java.lang.NoClassDefFoundError)
(unable to get stack trace for java.lang.NoClassDefFoundError)
hdfsBuilderConnect(forceNewInstance=1, nn=hdfs://am01.halo-telekom.com, port=8020, kerbTicketCachePath=FILE:/tmp/krb5cc_1000, userName=singhab@HALO-TELEKOM.COM) error:
(unable to get root cause for java.lang.NoClassDefFoundError)
(unable to get stack trace for java.lang.NoClassDefFoundError)
/arrow/cpp/src/arrow/status.cc:137: Failed to disconnect hdfs client: IOError: HDFS hdfsFS::Disconnect failed. Detail: [errno 9] Bad file descriptor
2023-06-29 10:42:47,541 ERROR: unexpected error - HDFS connection failed
Traceback (most recent call last):
File "/opt/conda/lib/python3.8/site-packages/dvc/cli/__init__.py", line 210, in main
ret = cmd.do_run()
File "/opt/conda/lib/python3.8/site-packages/dvc/cli/command.py", line 26, in do_run
return self.run()
File "/opt/conda/lib/python3.8/site-packages/dvc/commands/data_sync.py", line 60, in run
processed_files_count = self.repo.push(
File "/opt/conda/lib/python3.8/site-packages/dvc/repo/__init__.py", line 65, in wrapper
return f(repo, *args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/dvc/repo/push.py", line 92, in push
result = self.cloud.push(
File "/opt/conda/lib/python3.8/site-packages/dvc/data_cloud.py", line 154, in push
return self.transfer(
File "/opt/conda/lib/python3.8/site-packages/dvc/data_cloud.py", line 135, in transfer
return transfer(src_odb, dest_odb, objs, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/dvc_data/hashfile/transfer.py", line 203, in transfer
status = compare_status(
File "/opt/conda/lib/python3.8/site-packages/dvc_data/hashfile/status.py", line 178, in compare_status
dest_exists, dest_missing = status(
File "/opt/conda/lib/python3.8/site-packages/dvc_data/hashfile/status.py", line 149, in status
odb.oids_exist(hashes, jobs=jobs, progress=pbar.callback)
File "/opt/conda/lib/python3.8/site-packages/dvc_objects/db.py", line 406, in oids_exist
return list(wrap_iter(remote_oids, callback))
File "/opt/conda/lib/python3.8/site-packages/dvc_objects/db.py", line 36, in wrap_iter
for index, item in enumerate(iterable, start=1):
File "/opt/conda/lib/python3.8/site-packages/dvc_objects/db.py", line 354, in list_oids_exists
in_remote = self.fs.exists(paths, batch_size=jobs)
File "/opt/conda/lib/python3.8/site-packages/dvc_objects/fs/base.py", line 352, in exists
if self.fs.async_impl:
File "/opt/conda/lib/python3.8/site-packages/funcy/objects.py", line 47, in __get__
return prop.__get__(instance, type)
File "/opt/conda/lib/python3.8/site-packages/funcy/objects.py", line 25, in __get__
res = instance.__dict__[self.fget.__name__] = self.fget(instance)
File "/opt/conda/lib/python3.8/site-packages/dvc_hdfs/__init__.py", line 58, in fs
return HadoopFileSystem(**self.fs_args)
File "/opt/conda/lib/python3.8/site-packages/fsspec/spec.py", line 79, in __call__
obj = super().__call__(*args, **kwargs)
File "/opt/conda/lib/python3.8/site-packages/fsspec/implementations/arrow.py", line 278, in __init__
fs = HadoopFileSystem(
File "pyarrow/_hdfs.pyx", line 95, in pyarrow._hdfs.HadoopFileSystem.__init__
File "pyarrow/error.pxi", line 144, in pyarrow.lib.pyarrow_internal_check_status
File "pyarrow/error.pxi", line 115, in pyarrow.lib.check_status
OSError: HDFS connection failed
2023-06-29 10:42:47,630 DEBUG: link type reflink is not available ([Errno 95] no more link types left to try out)
2023-06-29 10:42:47,630 DEBUG: Removing '/home/singhab/.FxZzopJoqjfYPoAjwtVGz6.tmp'
2023-06-29 10:42:47,636 DEBUG: Removing '/home/singhab/.FxZzopJoqjfYPoAjwtVGz6.tmp'
2023-06-29 10:42:47,645 DEBUG: Removing '/home/singhab/.FxZzopJoqjfYPoAjwtVGz6.tmp'
2023-06-29 10:42:47,650 DEBUG: Removing '/home/singhab/dvc-example/.dvc/cache/.hEYTKG7bHugmicwMbkzNsk.tmp'
2023-06-29 10:42:47,665 DEBUG: Version info for developers:
DVC version: 2.58.2 (conda)
---------------------------
Platform: Python 3.8.16 on Linux-3.10.0-957.1.3.el7.x86_64-x86_64-with-glibc2.10
Subprojects:
dvc_data = 0.51.0
dvc_objects = 0.22.0
dvc_render = 0.5.3
dvc_task = 0.2.1
scmrepo = 1.0.3
Supports:
hdfs (fsspec = 2023.6.0, pyarrow = 12.0.0),
http (aiohttp = 3.8.4, aiohttp-retry = 2.8.3),
https (aiohttp = 3.8.4, aiohttp-retry = 2.8.3),
s3 (s3fs = 2023.6.0, boto3 = 1.26.161)
Config:
Global: /home/singhab/.config/dvc
System: /etc/xdg/dvc
Cache types: hardlink, symlink
Cache directory: nfs4 on fs-a461edff.efs.eu-central-1.amazonaws.com:/halo-claim-singhab-jupyter-pvc-2486a8de-d78a-4515-8ea0-cf2fe89befe2
Caches: local
Remotes: hdfs, s3
Workspace directory: nfs4 on fs-a461edff.efs.eu-central-1.amazonaws.com:/halo-claim-singhab-jupyter-pvc-2486a8de-d78a-4515-8ea0-cf2fe89befe2
Repo: dvc, git
Repo.site_cache_dir: /var/tmp/dvc/repo/e621ece895c6241383df59f56935951d
Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
2023-06-29 10:42:47,669 DEBUG: Analytics is enabled.
2023-06-29 10:42:47,708 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/tmp/tmpdlov741k']'
2023-06-29 10:42:47,711 DEBUG: Spawned '['daemon', '-q', 'analytics', '/tmp/tmpdlov741k']'
Looks like something is wrong with your credentials/config. Does the hdfs CLI work?
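For what it's worth, the java.lang.NoClassDefFoundError lines in the log are often a sign that libhdfs (which pyarrow's HadoopFileSystem loads under the hood) cannot find the Hadoop jars. A minimal environment sanity check might look like the following sketch; the paths are assumptions, adjust them for your cluster:

```shell
# Sketch of an environment check for libhdfs/pyarrow; paths are illustrative.
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk    # your JVM location
export HADOOP_HOME=/opt/hadoop                  # your Hadoop install
# libhdfs needs the Hadoop jars on CLASSPATH; a missing CLASSPATH is a
# common cause of java.lang.NoClassDefFoundError at connect time.
export CLASSPATH="$("$HADOOP_HOME/bin/hadoop" classpath --glob)"
echo "$CLASSPATH" | tr ':' '\n' | head           # eyeball the jar list
```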
Hello, the HDFS CLI is working. I've also shared the .dvc/config file for more info.
singhab@jupyter-singhab-jupyter:~/dvc-example$ hdfs dfs -ls hdfs://am01.halo-telekom.com:8020/user/halo/dvc-usecase
Found 1 items
-rw-rw-r--+ 3 singhab hadoop 0 2023-06-15 09:45 hdfs://am01.halo-telekom.com:8020/user/halo/dvc-usecase/test.txt
===================================================
singhab@jupyter-singhab-jupyter:~/dvc-example$ cat .dvc/config
[core]
remote = myremotes3
['remote "myremote"']
url = hdfs://am01.halo-telekom.com:8020/user/halo/dvc-usecase
user = singhab
@subodhdere So what credentials are you using, and how? Kerberos, maybe?
Overall seems like a configuration issue.
Hello team, we are using Kerberos for authentication.
@subodhdere You need to specify kerb ticket, see https://dvc.org/doc/user-guide/data-management/remote-storage/hdfs#hdfs-configuration-parameters
Hello team, I have executed the provided commands related to Kerberos authentication.
dvc remote modify --local myremote kerb_ticket FILE:/tmp/krb5cc_1000
dvc remote add -d myremote hdfs://am01.halo-telekom.com:8020/user/halo/dvc-usecase
dvc remote modify --local myremote user "singhab@HALO-TELEKOM.COM"
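For reference, assuming those commands succeeded, the resulting config files should look roughly like the sketch below (dvc remote add writes to .dvc/config, while the --local modifications land in .dvc/config.local; the values here simply echo the commands above):

```ini
# .dvc/config (written by `dvc remote add -d`)
[core]
    remote = myremote
['remote "myremote"']
    url = hdfs://am01.halo-telekom.com:8020/user/halo/dvc-usecase

# .dvc/config.local (written by `dvc remote modify --local`)
['remote "myremote"']
    kerb_ticket = FILE:/tmp/krb5cc_1000
    user = singhab@HALO-TELEKOM.COM
```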
You can refer to the message below from the error:
hdfsBuilderConnect(forceNewInstance=1, nn=hdfs://
@subodhdere Seems like the error is cut off.
Hello team, the full error is the same as the log I posted above.
@subodhdere @efiop Hi! I have the same error. Any updates? @subodhdere, how did you deal with it?
My traceback:
loadFileSystems error:
(unable to get root cause for java.lang.NoClassDefFoundError)
(unable to get stack trace for java.lang.NoClassDefFoundError)
hdfsBuilderConnect(forceNewInstance=1, nn=hdfs://cdp, port=8020, kerbTicketCachePath=FILE:/tmp/krb5cc_298426831_298426831, userName=user05) error:
(unable to get root cause for java.lang.NoClassDefFoundError)
(unable to get stack trace for java.lang.NoClassDefFoundError)
/arrow/cpp/src/arrow/status.cc:137: Failed to disconnect hdfs client: IOError: HDFS hdfsFS::Disconnect failed. Detail: [errno 9] Bad file descriptor
ERROR: unexpected error - HDFS connection failed
Bug Report

Issue name
DVC fails to push data to HDFS (dvc push -r myremote -v)

Description
Getting the below error while pushing data to HDFS.

Reproduce
dvc push -r myremote -v

Expected
Data should be pushed to HDFS.

Output of dvc doctor:
Additional Information (if any):