HumanSignal / label-studio-ml-backend

Configs and boilerplates for Label Studio's Machine Learning backend
Apache License 2.0

get_local_path fails if the task is from MinIO #376

Open g811201 opened 10 months ago

g811201 commented 10 months ago

I built an ML backend and want to run predictions with it. Predictions work for data uploaded through the Label Studio UI. However, when the data is loaded from S3 cloud storage (I use MinIO rather than AWS S3), this call fails:

file_path = get_local_path(task['data'][self.value_label],hostname=HOSTNAME, access_token=API_KEY)

where the value passed in is:

task['data'][self.value_label] = 's3://image-classification-catdog/test/Dog_test (8).jpg'

I got:
"""
[2023-11-03 08:44:18,668] [ERROR] [label_studio_ml.exceptions::exception_f::53] Traceback (most recent call last):
  File "d:\labelstudio\label-studio-ml-backend-master\label_studio_ml\exceptions.py", line 39, in exception_f
    return f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^
  File "d:\labelstudio\label-studio-ml-backend-master\label_studio_ml\api.py", line 60, in _predict
    predictions = model.predict(tasks, context=context, **params)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\labelstudio\my_example\ImageClassification\new\model.py", line 91, in predict
    file_path = get_local_path(task['data'][self.value_label],hostname=HOSTNAME, access_token=API_KEY)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\miniforge\envs\develop\Lib\site-packages\label_studio_tools\core\utils\io.py", line 104, in get_local_path
    r = requests.get(url, stream=True, headers=headers)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\miniforge\envs\develop\Lib\site-packages\requests\api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\miniforge\envs\develop\Lib\site-packages\requests\api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\miniforge\envs\develop\Lib\site-packages\requests\sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\miniforge\envs\develop\Lib\site-packages\requests\sessions.py", line 697, in send
    adapter = self.get_adapter(url=request.url)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\miniforge\envs\develop\Lib\site-packages\requests\sessions.py", line 794, in get_adapter
    raise InvalidSchema(f"No connection adapters were found for {url!r}")
requests.exceptions.InvalidSchema: No connection adapters were found for 's3://image-classification-catdog/test/Dog_test (8).jpg'
"""

I think the reason is that MinIO and S3 are not 100% compatible.

Is there any way to solve this?

g811201 commented 10 months ago

My temporary workaround is to build a MinIO client and fetch the data directly from MinIO instead of going through the Label Studio cache. This works, but it requires MinIO's key_id and access_key, which is a little inconvenient because I can't get them from Label Studio.
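
For anyone who wants to try the same workaround, here is a minimal sketch of what it could look like. It is not the code from my backend and not part of label-studio-ml-backend. The helper names split_s3_uri and download_from_minio, the cache directory, and the endpoint and credential parameters are all hypothetical. It assumes the minio Python package is installed (pip install minio) and that you supply the MinIO credentials yourself.

```python
import os
import tempfile


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split 's3://bucket/some/key' into (bucket, key). Hypothetical helper."""
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an s3:// URI: {uri!r}")
    # Split on the first '/' only, so keys containing '/', spaces,
    # or parentheses are preserved unchanged.
    bucket, _, key = uri[len(prefix):].partition("/")
    if not bucket or not key:
        raise ValueError(f"missing bucket or object key: {uri!r}")
    return bucket, key


def download_from_minio(uri: str, endpoint: str, access_key: str,
                        secret_key: str, secure: bool = False) -> str:
    """Download the object behind an s3:// URI from MinIO; return the local path.

    Assumption: endpoint looks like 'localhost:9000' and the credentials
    are provided by the caller (they are not available from Label Studio).
    """
    # Imported here so the URI helper above works without the minio package.
    from minio import Minio

    bucket, key = split_s3_uri(uri)
    client = Minio(endpoint=endpoint, access_key=access_key,
                   secret_key=secret_key, secure=secure)

    # Hypothetical cache location under the system temp directory.
    local_path = os.path.join(tempfile.gettempdir(), "minio-cache",
                              bucket, *key.split("/"))
    os.makedirs(os.path.dirname(local_path), exist_ok=True)
    client.fget_object(bucket_name=bucket, object_name=key,
                       file_path=local_path)
    return local_path
```

In predict, the idea would be to call download_from_minio when task['data'][self.value_label] starts with 's3://' and to keep using get_local_path for everything else.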