Lyleregenwetter / BIKED

A Dataset and Machine Learning Benchmarks for Data-Driven Bicycle Design
MIT License

Match bike images with reduced parametric data #1

Closed lebeli closed 1 year ago

lebeli commented 1 year ago

Hello,

I'd like to use the bike images (folder: 'Segmented bike images/bikes/') along with some data from BIKED_reduced.csv. How can I match the images with the respective samples in BIKED_reduced.csv? Can I match them using the image names and the column 'Unnamed: 0' (which I suppose is some kind of index)? E.g., 'bike (10).jpg' with the data from the row where 'Unnamed: 0' is 10? Is this assumption correct? I ask because there are ~250 images whose names don't correspond to any value in column 'Unnamed: 0'.

I am also asking because of your warning: 'Warning: Images and processed parametric data do not contain the exact same set of models'.

Kind regards.

Lyleregenwetter commented 1 year ago

That's correct, the indices currently don't correspond to the parametric data. Unfortunately, I don't have the mapping from the segmented image labels to the indices. This happened due to a bit of an oversight: we generated the segmented images with a different pipeline. I'll circle back to this in a couple of days, regenerate the segmented images under the same labeling scheme, and update the dataset.

lebeli commented 1 year ago

Thank you, much appreciated!

lebeli commented 1 year ago

Hello @Lyleregenwetter, any updates yet?

Edit: Any chance that the processing pipelines could be released?

Lyleregenwetter commented 1 year ago

Yes, I have rewritten the code and am currently generating the images (approx 3000 of 4500 done). This should be finished by next week. Unfortunately I won't be able to release the pipeline at this time since an integral step is a modified version of the proprietary BikeCAD software. The best I can do is provide the code that generates the BikeCAD files from which the images are later derived.

lebeli commented 1 year ago

Could you give an update when the image generation is finished?

Lyleregenwetter commented 1 year ago

Hi @lebeli,

Apologies for the delay. I have updated the Dropbox with a new zip file, entitled Bikedv2. This contains svg and png files of the full bikes (all_XXXX.png) and components (component_XXXX.png). It is formatted slightly differently, though. As requested, the numbers now correspond to the tabular data in BIKED (tabular data is unchanged from v1). Nonexistent components are simply included as a blank image. Any bike or component image that failed processing is omitted.

I am hoping the new format is just as user-friendly, but I am happy to reorganize things if you'd prefer. Please let me know if I can help with anything else. Here's the link.

https://www.dropbox.com/home/decode_lab/Datasets/Public%20Documents/BIKED%20%26%20FRAMED%20Datasets
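For readers who want to pair the v2 images with the tabular data, the following is a minimal sketch rather than the maintainer's own code. It assumes the full-bike files are named `all_<number>.png`, that `<number>` equals the value in the first (unnamed) column of the CSV, and that any zero-padding can be ignored. The function names are hypothetical. Images without a matching row are skipped, and indices with no image (for example, bikes that failed processing) simply do not appear in the result.

```python
import csv
import re
from pathlib import Path

# Assumed filename pattern for full-bike images, e.g. "all_1058.png".
IMAGE_NAME = re.compile(r"^all_(\d+)\.png$")


def index_from_filename(name):
    """Return the integer index encoded in an image filename, or None."""
    match = IMAGE_NAME.match(name)
    return int(match.group(1)) if match else None


def load_rows_by_index(csv_path):
    """Map the first CSV column (the unnamed index column) to its row."""
    rows = {}
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        for row in reader:
            rows[int(row[0])] = dict(zip(header[1:], row[1:]))
    return rows


def match_images_to_rows(image_dir, csv_path):
    """Pair each image with its tabular row; skip images without a row."""
    rows = load_rows_by_index(csv_path)
    pairs = {}
    for path in sorted(Path(image_dir).glob("all_*.png")):
        idx = index_from_filename(path.name)
        if idx is not None and idx in rows:
            pairs[idx] = (path, rows[idx])
    return pairs
```

Calling `match_images_to_rows("Bikedv2", "BIKED_reduced.csv")` (both paths are placeholders) would return a dictionary keyed by index, each value holding the image path and the parameter row as strings.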

lebeli commented 1 year ago

Thank you very much! In the old dataset there was a folder \Segmented bike images\bike with rectangular bike images. Any chance you have those, too? If not, how exactly did you generate these (positioning of the bike etc.)?

Lyleregenwetter commented 1 year ago

I did not generate these square images this time. Instead, the all_XXXX bikes are the equivalent, content-wise. I believe you should be able to achieve the rectangular images through some cropping/resizing. I left them "raw" to give users more flexibility (and included the svgs in case users want to mess with the png conversion). However, I am happy to do this cropping/rescaling if you'd like, since I can see that it may be convenient for people looking to quickly train CNN models.

Some notes:

lebeli commented 1 year ago

Thank you very much for the insights into the process. I would very much appreciate it if you could provide the rescaled images as well, so that I can make sure my results are comparable with training (without labels) on the images you provided.

Lyleregenwetter commented 1 year ago

I'm not sure how you wanted the images rescaled, but I created some square 1024x1024 images which are in the same folder. Hope this is what you needed.

lebeli commented 1 year ago

I noticed that in the square images the bikes are positioned in the bottom-left corner, whereas in the original dataset the bikes are centered in the square images. Would it be possible to provide images with centered bikes?

Lyleregenwetter commented 1 year ago

I can artificially position the bike closer to the center, but there will be a higher chance it clips the edge of the image. I can also re-scale the images so they fit in the center, but you will lose the 1:1 correspondence from pixel distance to real-world distance. It's hard for me to guess exactly what you need for your application, so I think it may be best for you to modify the images as you see fit.

lebeli commented 1 year ago

So, for the purpose of training a CNN, would you say your given format is (best) suited?

Edit: I would like to train a GAN with your bike images.

Lyleregenwetter commented 1 year ago

I see. For a GAN, it would probably be best to rescale to the size of the bike so that the bike fills up the entire image.

lebeli commented 1 year ago

Filling up the entire image, i.e. both tires touch the side edge of the image as well as the bottom edge? And vertically the image is expanded (up) to fit a square?

Lyleregenwetter commented 1 year ago

I would have both tires touch the side edges, position the image at the bottom, and let it rise up a proportional distance in the image, to avoid distorting it.
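The placement described above can be sketched as follows. This is an illustrative sketch, not the code used by either participant. It assumes a white image background and that Pillow is installed for the image step. The function names and the fallback for bikes taller than they are wide are assumptions. `fit_to_square` holds only the geometry and needs no third-party library.

```python
def fit_to_square(bbox_width, bbox_height, canvas_size):
    """Return (new_width, new_height, left, top) for the scaled bike crop.

    The crop is scaled uniformly so its width equals the canvas width,
    then anchored to the bottom edge of the square canvas.
    """
    scale = canvas_size / bbox_width
    new_width = canvas_size
    new_height = round(bbox_height * scale)
    if new_height > canvas_size:
        # Assumed fallback: a crop taller than it is wide is fitted to the
        # canvas height instead and centered horizontally.
        scale = canvas_size / bbox_height
        new_width = round(bbox_width * scale)
        new_height = canvas_size
    left = (canvas_size - new_width) // 2
    top = canvas_size - new_height
    return new_width, new_height, left, top


def square_bike_image(src_path, dst_path, canvas_size=1024):
    """Crop a bike to its bounding box and paste it onto a white square."""
    from PIL import Image, ImageChops  # Pillow, imported lazily

    image = Image.open(src_path).convert("RGB")
    # Assumption: the background is pure white, so any differing pixel
    # belongs to the bike.
    background = Image.new("RGB", image.size, "white")
    bbox = ImageChops.difference(image, background).getbbox()
    if bbox is None:
        raise ValueError("image is blank")
    crop = image.crop(bbox)
    w, h, left, top = fit_to_square(crop.width, crop.height, canvas_size)
    canvas = Image.new("RGB", (canvas_size, canvas_size), "white")
    canvas.paste(crop.resize((w, h), Image.LANCZOS), (left, top))
    canvas.save(dst_path)
```

Because each bike is scaled by its own factor, this gives up the 1:1 correspondence between pixel distance and real-world distance mentioned earlier in the thread. The uniform scale factor keeps the bike itself undistorted.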

lebeli commented 1 year ago

I have generated an example image. Is that how you would do it?

[Image: all_1058]

Lyleregenwetter commented 1 year ago

Yeah, I think that looks great!

lebeli commented 1 year ago

This issue can be closed, thank you very much for your help!

Lyleregenwetter commented 1 year ago

Thanks for your patience!