Open smartguo opened 1 year ago
By the way, I ran the Python script on my Mac laptop; when I run it on the Linux server, it works fine.
This is another example from my Mac laptop.
Code
import xorbits
import xorbits.remote as xr
import xorbits.pandas as pd

xorbits.init(address="http://10.***.***.42:13062")

option = {
    "key": "***",
    "secret": "***",
    "endpoint_url": "https://cos.ap-beijing.myqcloud.com",
}
df = pd.read_parquet(
    "s3://<bucket>/<path>/pt=20230403/00000.parquet",
    storage_options=option,
)
print(df.head())
Error message:
Metric is not initialized, please call `init_metrics()` before using metrics.
100%|█████████| 100.00/100 [00:01<00:00, 68.90it/s]
Traceback (most recent call last):
File "/Users/***/workspace/xorbits/test_session.py", line 20, in <module>
print(df.head())
File "/Users/***/lib/anaconda3/lib/python3.9/site-packages/xorbits/utils.py", line 37, in inn
return f(self, *args, **kwargs)
File "/Users/***/lib/anaconda3/lib/python3.9/site-packages/xorbits/core/data.py", line 293, in __str__
return self.data.__str__()
File "/Users/***/lib/anaconda3/lib/python3.9/site-packages/xorbits/core/data.py", line 119, in __str__
return self._mars_entity.__str__()
File "/Users/***/lib/anaconda3/lib/python3.9/site-packages/xorbits/_mars/core/entity/core.py", line 102, in __str__
return self._data.__str__()
File "/Users/***/lib/anaconda3/lib/python3.9/site-packages/xorbits/_mars/dataframe/core.py", line 2214, in __str__
return self._to_str(representation=False)
File "/Users/***/lib/anaconda3/lib/python3.9/site-packages/xorbits/_mars/dataframe/core.py", line 2186, in _to_str
corner_data = fetch_corner_data(self, session=self._executed_sessions[-1])
File "/Users/***/lib/anaconda3/lib/python3.9/site-packages/xorbits/_mars/dataframe/utils.py", line 1242, in fetch_corner_data
return df_or_series._fetch(session=session)
File "/Users/***/lib/anaconda3/lib/python3.9/site-packages/xorbits/_mars/core/entity/executable.py", line 169, in _fetch
return fetch(self, session=session, **kw)
File "/Users/***/lib/anaconda3/lib/python3.9/site-packages/xorbits/_mars/deploy/oscar/session.py", line 1760, in fetch
return session.fetch(tileable, *tileables, **kwargs)
File "/Users/***/lib/anaconda3/lib/python3.9/site-packages/xorbits/_mars/deploy/oscar/session.py", line 1562, in fetch
return asyncio.run_coroutine_threadsafe(coro, self._loop).result()
File "/Users/***/lib/anaconda3/lib/python3.9/concurrent/futures/_base.py", line 445, in result
return self.__get_result()
File "/Users/***/lib/anaconda3/lib/python3.9/concurrent/futures/_base.py", line 390, in __get_result
raise self._exception
File "/Users/***/lib/anaconda3/lib/python3.9/site-packages/xorbits/_mars/deploy/oscar/session.py", line 1728, in _fetch
data = await session.fetch(tileable, *tileables, **kwargs)
File "/Users/***/lib/anaconda3/lib/python3.9/site-packages/xorbits/_mars/deploy/oscar/session.py", line 1019, in fetch
fetched_data = await fetcher.get()
File "/Users/***/lib/anaconda3/lib/python3.9/site-packages/xorbits/_mars/services/task/execution/mars/fetcher.py", line 60, in get
fetched_data = await storage_api.get.batch(
File "/Users/***/lib/anaconda3/lib/python3.9/site-packages/xoscar/batch.py", line 147, in _async_batch
return [await self._async_call(*args_list[0], **kwargs_list[0])]
File "/Users/***/lib/anaconda3/lib/python3.9/site-packages/xoscar/batch.py", line 96, in _async_call
return await self.func(*args, **kwargs)
File "/Users/***/lib/anaconda3/lib/python3.9/site-packages/xorbits/_mars/services/storage/api/web.py", line 138, in get
return deserialize_serializable(res.body)
File "/Users/***/lib/anaconda3/lib/python3.9/site-packages/xorbits/_mars/utils.py", line 469, in deserialize_serializable
return deserialize(header2, buffers2)
File "xoscar/serialization/core.pyx", line 862, in xoscar.serialization.core.deserialize
File "xoscar/serialization/core.pyx", line 811, in xoscar.serialization.core._deserial_single
File "xoscar/serialization/core.pyx", line 268, in xoscar.serialization.core.PickleSerializer.deserial
File "xoscar/serialization/core.pyx", line 241, in xoscar.serialization.core.unpickle_buffers
AttributeError: Can't get attribute '_unpickle_block' on <module 'pandas._libs.internals' from '/Users/***/lib/anaconda3/lib/python3.9/site-packages/pandas/_libs/internals.cpython-39-darwin.so'>
After confirming with @smartguo: the client, running on a Mac with an ARM chip, submits the spawn task to a Linux x86 server, and the Python env on Linux is the whole env packed on the Mac. Cloudpickle may have some issues in this scenario.
The Python env on Linux is the whole env packed on Linux itself. It looks like cloudpickle doesn't support serde across different systems or architectures. However, this scenario doesn't matter much; developing in a notebook on the server is fine.
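One way to narrow this down is to compare the environments on both sides. The sketch below is not part of xorbits: `collect_env` and `diff_env` are hypothetical helper names, and the package list is an assumption about which packages matter for unpickling. An AttributeError raised while unpickling often means the receiving side has a different version of the library (here pandas or cloudpickle) than the side that produced the pickle, though that is not confirmed for this report.

```python
import platform
from importlib import metadata

# Assumed list of packages whose versions may need to match on client and server.
PACKAGES = ("pandas", "numpy", "cloudpickle", "xorbits", "xoscar")


def collect_env(packages=PACKAGES):
    """Return a dict with the interpreter version, platform, and package versions."""
    info = {
        "python": platform.python_version(),
        "system": platform.system(),    # e.g. "Darwin" or "Linux"
        "machine": platform.machine(),  # e.g. "arm64" or "x86_64"
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in this environment.
            info[name] = None
    return info


def diff_env(client, server):
    """Return {key: (client_value, server_value)} for every key whose values differ."""
    keys = sorted(set(client) | set(server))
    return {
        key: (client.get(key), server.get(key))
        for key in keys
        if client.get(key) != server.get(key)
    }


if __name__ == "__main__":
    # Run this on the client and on a server node, then compare the two outputs.
    print(collect_env())
```

Running the script on the Mac client and on a Linux worker, then passing both dicts to `diff_env`, lists only the entries that differ. A difference in `system` or `machine` alone is expected in this setup; differing `python`, `pandas`, or `cloudpickle` versions would be the first thing to align.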
Describe the bug
I'm using xorbits.init to connect to a cluster deployed on YARN, and then I get AttributeError: Can't get attribute '_make_function'
To Reproduce
To help us to reproduce this bug, please provide information below:
Hadoop version: Hadoop 3.2.2
Code
Error message: