HumanSignal / label-studio

Label Studio is a multi-type data labeling and annotation tool with standardized output format
https://labelstud.io
Apache License 2.0
18.39k stars 2.31k forks

How could I load image from local Source Storage? #774

Closed li3cmz closed 3 years ago

li3cmz commented 3 years ago

Hi, I followed the doc here to create a local storage. [screenshot a]

Then I started using the local storage in two steps.

  1. Put the *.png files into the local path that I selected above.
  2. Import my tasks through the JSON file below, with the image value ("original_image") inside the red box. [screenshot b]

However, the image failed to load this way, as shown below. [screenshot c]

Could you help me with the bug? Looking forward to your reply.

makseq commented 3 years ago

Hi! For Local files storage you have to use /data/local-files/1.png.

makseq commented 3 years ago

@smoreface Please, could you add to this doc section https://labelstud.io/guide/tasks.html#Import-data-from-a-local-directory a task example with the correct path to file (and maybe some note about /data/local-files/ path), like:


{
  "data": {
    "image": "/data/local-files/1.jpg"
  }
}

li3cmz commented 3 years ago

> Hi! For Local files storage you have to use /data/local-files/1.png.

Hi, I tried your suggestion but it failed again, as shown below. [screenshot e]

What are the requirements for the local path when adding the source storage? I set it to the absolute path starting from the home directory, as shown below. [screenshot d]

Morgadoooo commented 3 years ago

Hi,

Isn't it supposed to be /data/local-files/?d=1.png? Or am I misunderstanding something?

@makseq @li3cmz

makseq commented 3 years ago

@Morgadoooo yes, of course. You are right!
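
For illustration, here is a minimal sketch of a task list that uses the `?d=` form confirmed above. The helper name `make_task`, the file names, and the `image` key are assumptions made for this example; the key has to match the variable used in your labeling config, and the path after `?d=` is taken relative to the local files document root.

```python
import json


def make_task(relative_path: str, key: str = "image") -> dict:
    """Build one task whose data field points at a locally served file.

    relative_path is assumed to be relative to the local files document root.
    """
    return {"data": {key: "/data/local-files/?d=" + relative_path}}


# Hypothetical file names, used only for illustration.
tasks = [make_task(name) for name in ["1.png", "2.png"]]
tasks_json = json.dumps(tasks, indent=2)
print(tasks_json)
```

The printed JSON can be saved to a file and imported as tasks.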

Morgadoooo commented 3 years ago

> Hi! For Local files storage you have to use /data/local-files/1.png.
>
> Hi, I tried your suggestion but failed again as below.
>
> I want to know the requirement of setting the local path while adding the source storage. It is the absolute path of starting from the home directory as below.

Hi, if I'm not mistaken, I got this info from core.settings.base:

# Build paths inside the project like this: os.path.join(BASE_DIR, ...)
BASE_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))

So I think you can probably specify the path relative to BASE_DIR.

Edit: Never mind, I misunderstood. Here is the code for get_data for local files (label_studio/io_storages/localfiles/models.py):

document_root = Path(get_env('LOCAL_FILES_DOCUMENT_ROOT', default='/'))
relative_path = str(path.relative_to(document_root))

So, given this: if you didn't define a LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT env variable, the base path will be /.
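
A small sketch of what that `relative_to` call implies for the path you put in a task. The function name `local_files_url` and the example paths are hypothetical; the `/data/local-files/?d=` prefix is the one discussed earlier in this thread, not something read from the Label Studio source here.

```python
from pathlib import PurePosixPath


def local_files_url(file_path: str, document_root: str = "/") -> str:
    """Make file_path relative to the document root and build the task URL.

    Raises ValueError if file_path is not located under document_root.
    """
    relative_path = PurePosixPath(file_path).relative_to(PurePosixPath(document_root))
    return "/data/local-files/?d=" + relative_path.as_posix()


# With the default root "/", the whole absolute path (minus the leading slash) is kept.
print(local_files_url("/home/user/images/1.png"))
# With a narrower document root, only the part below the root remains.
print(local_files_url("/home/user/images/1.png", "/home/user"))
```

So the value after `?d=` depends on the document root: the narrower the root, the shorter the path that goes into the task.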

li3cmz commented 3 years ago

Hi, if I'm not mistaken, the code for get_data may be inappropriate and misleading.

Maybe the environment variable LOCAL_FILES_DOCUMENT_ROOT should be replaced by self.path, which is received from the Label Studio UI. Then we wouldn't have to set LOCAL_FILES_DOCUMENT_ROOT separately.

At first I thought there was no need to set LOCAL_FILES_DOCUMENT_ROOT after setting the Local path in the Edit Source Storage UI, which caused the error I mentioned above.

The error appeared after I set the variable use_blob_urls to true, as shown in the screenshot below. [screenshot]

Maybe the problem could be solved by more specific documentation for LOCAL_FILES_DOCUMENT_ROOT or by changing the code.

zakajd commented 3 years ago

@li3cmz Did you manage to import local files without errors? What did you change in the initial example?

dsantiago commented 2 years ago

Man, I was liking the tool, but just importing a local folder is such a pain... We need a flag to bypass configuration when I already have the images and labels locally.

savhascelik commented 1 year ago

@dsantiago I absolutely agree, but I found a solution. After setting up the local source, I select the "Treat every bucket object as a source file" option and run "Sync Storage" to include the files. That shows me the paths of the files, and I then change the paths in the JSON files I import to match.

AlcobaAI commented 1 year ago

I just had this same issue and fixed it by formatting the path to each image like this: data/local-files/?d=home/path/to/image.jpg

yazansayed commented 1 year ago

I just want to express my disappointment at how such a necessary task as loading data from a local JSON is handled. I wasted a lot of time because of this.

Reasat commented 1 year ago

The above answers seem to contain a lot of confusion. Is there a simple answer to the question, "How to add images from a local directory?"

frippe75 commented 10 months ago

Ahh.. Just got back to trying to eval Label Studio after leaving it the last time, since it consumed ALL the time I had set aside for a project rather than moving the project forward and focusing on the "value-add". That should be the main focus of any type of tool. Back again with some fresh energy. Stuck again trying to run locally using Docker and local files. This time I'm restricting my eval time to 10 hours, and I'm soon out of them...

I resorted to using a custom Jupyter notebook and winging my own. The dataset quickly grew to 50,000 labeled images.

Now I'm stuck with 15 COCO dataset files. I used the label-studio-converter and tried multiple --image-root-url values to get them to match, but I'm getting 403 and I cannot even read the /data/local-files directory... There is an old issue where people use a chown -R during startup...

I think there should be a dead-simple way of getting up and running. I tried a conda env and pip install ... It never completes successfully and takes forever. Label Studio looks like a great option and I would really like to get started using it.

unfa commented 10 months ago

Let me be another one to say: I am very excited about Label Studio and creating my first dataset and CV model, but I spent hours trying to add my locally stored images. It seems like everyone says something different, and the docs make it look rather easy, but I get errors constantly and can't progress through this. Do I really need to put my local data into an S3 bucket to get going?

  1. I installed via pip in a dedicated Python venv;
  2. I exported the designated paths to the environment before running label-studio;
  3. I ran the helper web server as documented, but it never adds anything to the files.txt file. It's empty. Trying to specify a wildcard also produces errors from the find command about missing or wrong parameters. I worked around this by grabbing the HTML page source and turning it into a list of the local server's URLs to images that work (I can open them in my browser). I saved the list to a different .txt file;
  4. I selected the new .txt file at import and chose to treat it as multiple tasks;
  5. It seems to import OK and I can see thumbnails of my pictures, but when I select them for labeling, the picture sometimes shows for a split second, then disappears, and I see an error message instead: [screenshot]

On the other side I see the local web server providing the files without errors:

Running web server on the port 9002
Serving HTTP on 0.0.0.0 port 9002 (http://0.0.0.0:9002/) ...
127.0.0.1 - - [13/Nov/2023 21:34:08] "GET /IMG-20221223-WA0001.jpeg HTTP/1.1" 200 -
127.0.0.1 - - [13/Nov/2023 21:34:47] "GET /IMG-20221203-WA0008.jpeg HTTP/1.1" 200 -
127.0.0.1 - - [13/Nov/2023 21:34:49] "GET /IMG-20221213-WA0006.jpeg HTTP/1.1" 200 -
127.0.0.1 - - [13/Nov/2023 21:34:50] "GET /IMG-20221126-WA0014.jpeg HTTP/1.1" 200 -

The label-studio executable also doesn't print any errors in the console:

[2023-11-13 20:34:49,206] [django.server::log_message::161] [INFO] "GET /api/tasks/6?project=1 HTTP/1.1" 200 730
[2023-11-13 20:34:49,206] [django.server::log_message::161] [INFO] "GET /api/tasks/6?project=1 HTTP/1.1" 200 730
[2023-11-13 20:34:49,228] [django.server::log_message::161] [INFO] "GET /api/label_links?project=1&expand=label HTTP/1.1" 200 52
[2023-11-13 20:34:49,228] [django.server::log_message::161] [INFO] "GET /api/label_links?project=1&expand=label HTTP/1.1" 200 52
[2023-11-13 20:34:50,203] [django.server::log_message::161] [INFO] "GET /api/tasks/1?project=1 HTTP/1.1" 200 730
[2023-11-13 20:34:50,203] [django.server::log_message::161] [INFO] "GET /api/tasks/1?project=1 HTTP/1.1" 200 730
[2023-11-13 20:34:50,231] [django.server::log_message::161] [INFO] "GET /api/label_links?project=1&expand=label HTTP/1.1" 200 52
[2023-11-13 20:34:50,231] [django.server::log_message::161] [INFO] "GET /api/label_links?project=1&expand=label HTTP/1.1" 200 52

Maybe this here issue could be re-opened and renamed to a more general "Issues with using local images"?


EDIT (an hour later)

I have somehow managed to get it working. I made sure my "local media root" path is one step above the place where I actually store the photos, and then, when adding the local storage, I entered the longer absolute path, and it worked!

I first tried to import via the URL of an image, but that was one at a time. I've found that I can now "upload" the image files directly from the path and that... worked?

I was able to label my (small) dataset of images! Woo!
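
The arrangement described in the edit above (the storage path sits below the local media root) can be sanity-checked before starting Label Studio. This is only a sketch of that one condition: the function name and example paths are hypothetical, and requiring a strict subdirectory is an assumption taken from this comment, not from the Label Studio source.

```python
from pathlib import PurePosixPath


def storage_path_is_below_root(storage_path: str, document_root: str) -> bool:
    """Return True if storage_path is a strict subdirectory of document_root."""
    root = PurePosixPath(document_root)
    path = PurePosixPath(storage_path)
    # path.parents lists every ancestor of path, excluding path itself.
    return root in path.parents


# Hypothetical paths: the root is one step above the folder holding the photos.
print(storage_path_is_below_root("/home/user/photos/set1", "/home/user/photos"))
print(storage_path_is_below_root("/home/user/photos", "/home/user/photos"))
```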

AlcobaAI commented 10 months ago

Hi, This is how I got it to work, hope it helps :). The example is from what I was working on at the time.

  1. Create a project and go to Settings -> Cloud Storage. Add the location of the images and the output. Do not select "Treat every bucket object as a source file" and do not sync. Then check the connection.

[screenshot]

  2. Label Interface:
<View>
  <Image name="image" value="$filename" zoom="true" zoomControl="true" maxWidth="100%"/>
  <Header size="10" value="$source_en"/>
  <Header size="10" value="$caption"/>

  <View>
    <Text name="q1" value=""/>
    <Header size="10" value="Is the translation good? If so answer 'Yes', if not write an alternate translation."/>
    <TextArea name="answer1" toName="q1" rows="5" maxSubmissions="1"/>
  </View>
</View>
  3. Here I had a separate CSV file with a "filename" column and two other columns, "source_en" and "caption". In the filename column, put each path as "data/local-files/?d=/home/pathtoimages/image_name1.jpg". Then import the CSV.

[screenshot]
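
A CSV like the one in the last step can be generated with a short script. This is a sketch under assumptions: the column names come from the labeling config above, while the row values (the source sentence and caption) are made-up placeholders, and `write_tasks_csv` is a hypothetical helper name.

```python
import csv
import io

COLUMNS = ["filename", "source_en", "caption"]

# Placeholder row; only the path format is taken from the comment above.
rows = [
    {
        "filename": "data/local-files/?d=/home/pathtoimages/image_name1.jpg",
        "source_en": "example source sentence",
        "caption": "example caption",
    },
]


def write_tasks_csv(task_rows, stream) -> None:
    """Write one header line followed by one line per task."""
    writer = csv.DictWriter(stream, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(task_rows)


buffer = io.StringIO()
write_tasks_csv(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To produce a file for import, pass an opened file instead of the in-memory buffer, for example `open("tasks.csv", "w", newline="")`.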

SCZwangxiao commented 9 months ago

Here is the detailed explanation: https://labelstud.io/guide/storage.html#Local-storage

Trevol commented 4 days ago

I also searched for a simple option to tell Label Studio to take all images in a specified directory... But I managed to import images from local storage in the following way:

  1. My Label Studio runs as a Docker container, so my local images directory is mounted as a volume and the necessary environment is specified. Snippet from docker-compose.yml:

     environment:
       - LABEL_STUDIO_LOCAL_FILES_SERVING_ENABLED=true
       - LABEL_STUDIO_LOCAL_FILES_DOCUMENT_ROOT=/label-studio/data/images
     volumes:
       - "./label-studio-data:/label-studio/data"
       - "./images:/label-studio/data/images"

  2. Add a Source Cloud Storage with type "Local files" and path /label-studio/data/images/pylons. The subdirectory pylons contains the image files for annotation (object detection template). Individual images are now accessible via the URL {LABEL_STUDIO_HOST}/data/local-files?d=pylons/{IMAGE_FILE} (in my case http://localhost:8080/data/local-files?d=pylons/1.jpeg).

  3. Created a single text file (pylons.txt in my case) with URLs to the local images, then imported it into the Data Manager:
    [attachment: pylons_sample.txt] [screenshot]
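
The URL list from the last step could be generated rather than typed by hand. This is a rough sketch: the helper name `build_url_list` and the file names are hypothetical, and the host and `pylons` subdirectory are the values from this comment, so adjust them to your own setup.

```python
def build_url_list(file_names, host="http://localhost:8080", subdir="pylons"):
    """Return one local-files URL per image, in sorted order."""
    return [f"{host}/data/local-files?d={subdir}/{name}" for name in sorted(file_names)]


# Hypothetical file names; in practice these could come from os.listdir().
urls = build_url_list(["2.jpeg", "1.jpeg"])
url_list_text = "\n".join(urls) + "\n"
print(url_list_text)

# To create the import file, write url_list_text to a text file, e.g.:
# with open("pylons.txt", "w") as f:
#     f.write(url_list_text)
```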