Initial Implementation of Spritesheet/Dataset generator for the Development Workbench.

akhil-rana commented 4 years ago

This works with caMicroscope #409

Description

This helps create a dataset from the caMicroscope labelling files.
It extracts and organises the files from multiple zip files so that spritesheet generation can be performed.
It places the images as follows: ./workbench-utils/dataset/<labelname>/<corresponding images>
The ./workbench-utils/dataset/ folder is archived into a zip file, which is returned to the user after the spritesheet and labelling files have been generated.
The Dockerfile has been modified to install Python and the dependencies (Pillow, numpy, etc.) required to run ./workbench-utils/spritemaker.py, which is responsible for spritesheet generation.
All intermediate files, including the zips and datasets, are automatically deleted when the task is done.
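
The organise, archive, and clean-up steps above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the function name `build_dataset`, the `label` and `location` columns of patches.csv, and the assumption that `location` is the image's member name inside the zip are all hypothetical.

```python
import csv
import io
import shutil
import zipfile
from pathlib import Path


def build_dataset(label_zips, work_dir):
    """Sort images from several labelling zips into dataset/<label>/ and zip the result."""
    work_dir = Path(work_dir)
    dataset_dir = work_dir / "dataset"
    dataset_dir.mkdir(parents=True, exist_ok=True)

    for index, zip_path in enumerate(label_zips):
        with zipfile.ZipFile(zip_path) as archive:
            # Assumed layout: a patches.csv with "label" and "location" columns.
            with archive.open("patches.csv") as handle:
                text = io.TextIOWrapper(handle, encoding="utf-8")
                rows = list(csv.DictReader(text))
            for row in rows:
                label_dir = dataset_dir / row["label"]
                label_dir.mkdir(exist_ok=True)
                # Prefix with the zip index so identical file names from
                # different zips do not overwrite each other.
                target = label_dir / f"{index}_{Path(row['location']).name}"
                with archive.open(row["location"]) as src, open(target, "wb") as dst:
                    shutil.copyfileobj(src, dst)

    # Archive dataset/ as <work_dir>/dataset.zip, then delete the working copy.
    archive_path = shutil.make_archive(
        str(work_dir / "dataset"), "zip", root_dir=dataset_dir
    )
    shutil.rmtree(dataset_dir)
    return Path(archive_path)
```

In this sketch the returned dataset.zip would still need to be deleted by the caller once it has been sent to the user.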

Motivation and Context

Spritesheet generation can be quite heavy because it involves direct pixel-level processing of many images, so this task has to be performed on the server side.

How Has This Been Tested?

Briefly tested on Firefox (78.0) and Chrome (84.0) on Windows 10 and Linux (Arch).

Types of changes

[ ] Bug fix (non-breaking change which fixes an issue)
[x] New feature (non-breaking change which adds functionality)
[ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

[x] My code follows the code style of this project.
[ ] My change requires a change to the documentation.
[ ] I have updated the documentation accordingly.

birm commented 4 years ago

I'm testing now, but I'm still not sure how I feel about python in the caracal container. @nanli-emory thoughts?

nanli-emory commented 4 years ago

@birm I think it depends on your perspective for Caracal. For now, I think it is fine. But when the ML part grows, I think we can create a new container for the ML core.

birm commented 4 years ago

I'm getting some errors when trying to use it. I used the CMU3 sample image, and created ~ 5 "a" labels and 5 "b" labels, and tried to use that. The console returns

ca-back     | [Error: ENOENT: no such file or directory, access './workbench-utils/dataset.zip'] {
ca-back     |   errno: -2,
ca-back     |   code: 'ENOENT',
ca-back     |   syscall: 'access',
ca-back     |   path: './workbench-utils/dataset.zip'
ca-back     | }
ca-back     | POST /workbench/deleteDataset 200 1.116 ms - 24
ca-back     | Extracted
ca-back     | Unhandled rejection Error: File does not exist. Check to make sure the file path to your csv is correct.
ca-back     |     at /root/src/node_modules/csvtojson/v2/Converter.js:81:37
ca-back     |     at suppressedCallback (fs.js:210:5)
ca-back     |     at FSReqCallback.oncomplete (fs.js:154:23)
ca-back     | Extraction complete

akhil-rana commented 4 years ago

Okay. It is not able to locate the patches.csv file. I'm not sure why that is happening. It worked fine for me in all the cases. I might have missed something. I'm looking into it.

akhil-rana commented 4 years ago

Okay, it might be checking for the csv file even before its extraction. I'll try to solve this.
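
If that hypothesis is right, the fix is to read the CSV only after extraction has fully finished. The PR's own code is Node.js and is not shown in this thread, so the following Python sketch only illustrates the intended ordering; `extract_then_read_rows` and its parameters are hypothetical names.

```python
import csv
import zipfile
from pathlib import Path


def extract_then_read_rows(zip_path, dest_dir, csv_name="patches.csv"):
    """Extract the zip completely, then parse the CSV it is expected to contain."""
    dest_dir = Path(dest_dir)

    # extractall() returns only after every member has been written to disk,
    # so the check below cannot run ahead of the extraction.
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)

    csv_path = dest_dir / csv_name
    if not csv_path.is_file():
        # Fail with an explicit error instead of an unhandled one further down.
        raise FileNotFoundError(f"{csv_name} missing after extracting {zip_path}")

    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

In an asynchronous runtime the equivalent would be to await the extraction's completion before starting the CSV parse.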

akhil-rana commented 4 years ago

@birm Are you getting this every time you run the program, or randomly? Also, are the files you are using too big? I'm not able to reproduce this unhandled rejection error.

birm commented 4 years ago

I'd recommend moving the Python code to the slideloader service (or a new service in anticipation of more ML utilities), in any case. Let me know if you need either created.

akhil-rana commented 4 years ago

I don't think I'll need Python for anything other than spritesheet generation, though I might need Node.js for more utilities in the future. Do you suggest making a new Node.js container for ML services?

akhil-rana commented 4 years ago

Or, for now, I can try to move the Python code to slideloader and make changes here to reflect that. If I need more ML services in the future, I'll let you know.

akhil-rana commented 4 years ago

@birm I've implemented the spritesheet generator in Slideloader #32. I think we can close this now if you say so.

camicroscope / Caracal