Lyleregenwetter / BIKED

A Dataset and Machine Learning Benchmarks for Data-Driven Bicycle Design
MIT License

Generating & Rendering bcad files #3

Closed LeonardFritz closed 1 year ago

LeonardFritz commented 1 year ago

Hey Lyle,

First off, thanks for publishing your code! I was wondering how you rendered the bcad files. I've been experimenting with a few models, and it'd be nice if I could visualize the generated bikes somehow. Currently I have to manually upload the files to bikecad.ca and wait until the bike loads, which is super slow. Second question: in the genBCAD function, instead of writing a bcad file from scratch, you first open a template file and manipulate it. Why do you do it like that? The template file has over 6000 entries and you only edit those that are also in the generated dataframe, meaning there are ca. 5000 entries that are always the same, no? Also, when I tried generating a bcad file with your method, it wouldn't open on the website; I'm not sure why. I had started writing my own method (before I realized you had already implemented it), where I generated a new file from scratch. Those files would open on the website, though there were some issues; for example, boolean values were represented as 0 and 1. So my question is: why do you base your genBCAD method on a template file, and why does the output not work (for me)?

Again thanks a lot for publishing your code! -- Leonard

Lyleregenwetter commented 1 year ago

Hey Leonard,

Great questions! To answer your first one, I have a custom script to generate images from BikeCAD files. I basically hand-modified the BikeCAD source code to create an executable version that can be parallelized to render a bunch of images. I haven't made this public because it is a modified version of the proprietary BikeCAD software (for which I have paid for a license). BikeCAD is made by a solo developer (Brent) with whom I have collaborated in the past, and I obviously don't want to publish his IP without his consent or jeopardize his business. However, I could ask whether he is comfortable releasing such an executable version himself or letting me release my version.

Regarding your second question: You're absolutely right that any parameters that don't appear in your dataframe will be left at their "default" values from the template file. I have several versions of the dataset with different numbers of parameters, with fewer parameters generally being less expressive. For this reason, you may notice that "regenerated" images differ slightly from the images in the dataset. For the standard version of the dataset, most of the exclusions are parameters related to the current state of the BikeCAD GUI (locations of dimension labels, etc.). Notably, the component colors are omitted in that version of the dataset.
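A minimal sketch of the template approach described above, not the actual genBCAD code. It assumes a bcad file is XML made of `<entry key="...">value</entry>` elements; the keys, the tiny template, and the helper names are hypothetical. Writing booleans as `true`/`false` rather than 1/0 is also an assumption about what BikeCAD expects.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for the real template; keys and values are made up.
TEMPLATE = """<properties>
<entry key="Head angle">72.0</entry>
<entry key="Show fenders">false</entry>
<entry key="Dimension label x">120</entry>
</properties>"""


def format_value(value):
    """Render a Python value as entry text (assumed: booleans as true/false)."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def overlay_on_template(template_xml, overrides):
    """Return the template XML with every entry named in `overrides` replaced.

    Entries that are not in `overrides` keep their template default.
    """
    root = ET.fromstring(template_xml)
    for entry in root.iter("entry"):
        key = entry.get("key")
        if key in overrides:
            entry.text = format_value(overrides[key])
    return ET.tostring(root, encoding="unicode")


def read_entries(xml_text):
    """Parse the XML back into a {key: text} dict for inspection."""
    return {e.get("key"): e.text for e in ET.fromstring(xml_text).iter("entry")}


# One generated design that only covers two of the three template entries.
generated = {"Head angle": 71.5, "Show fenders": True}
result = read_entries(overlay_on_template(TEMPLATE, generated))
```

Here `result` keeps `"Dimension label x"` at its template value, which is the behavior described for parameters missing from the dataframe.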

I'm not exactly sure why your BikeCAD files aren't generating correctly. If you are using a one-hot encoded version of the dataset, you need to pass the from_oh=True flag to the processGen function. Maybe this is the problem? I will probably need more details to debug this.
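To illustrate what reversing a one-hot encoding involves, here is a rough sketch. It is not the repository's deOH function. The column naming scheme (`"<parameter> OHCLASS: <label>"`) and the example keys are assumptions made for illustration.

```python
def de_one_hot(row, sep=" OHCLASS: "):
    """Collapse one-hot columns in `row` back into one categorical value each.

    Columns whose name contains `sep` are treated as one-hot columns; for each
    parameter, the label with the highest score wins. Other columns pass through.
    """
    plain = {}
    best = {}  # parameter name -> (best score so far, its label)
    for column, value in row.items():
        if sep in column:
            base, label = column.split(sep, 1)
            if base not in best or value > best[base][0]:
                best[base] = (value, label)
        else:
            plain[column] = value
    for base, (_, label) in best.items():
        plain[base] = label
    return plain


# A generated sample with soft (non-binary) scores, as a model might emit.
sample = {
    "Head angle": 72.0,
    "MATERIAL OHCLASS: STEEL": 0.2,
    "MATERIAL OHCLASS: CARBON": 0.7,
    "MATERIAL OHCLASS: ALUMINIUM": 0.1,
}
decoded = de_one_hot(sample)
```

Taking the highest-scoring label means soft model outputs are mapped to a single valid category instead of being written out as fractional values.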

Regards, Lyle

LeonardFritz commented 1 year ago

Hey Lyle,

thanks for the quick reply! I completely understand why you wouldn't want to publish your modded BikeCAD renderer, I guess I'll just have to use the website.

I'll append my notebook where I've been testing out the dataset. At the end you'll see two cells: in the first one I just copied your deOH and genBCAD functions, and in the second one I wrote my own genBCAD function that generates a new bcad file without a template. My problem is that neither cell produces bcad files that I can open using the web editor. I don't get an error like 'corrupted file' or anything; it just never loads.

If you have some spare time I'd greatly appreciate if you could give me some pointers! -- Leonard

VAE BIKED Test.zip

Lyleregenwetter commented 1 year ago

Sure, I had a look through your notebook. I don't immediately see any issues. Before I take a deep dive, I was wondering if you could share a couple of generated bcad files that you made using both methods. I can check whether I can successfully generate images using my script or the BikeCAD app, so we can determine if it's a file generation issue or a website issue.

Also, there is a file called processGen with a function of the same name. If your generated data is in the same format as the dataset, I wonder if you could try just running that. It has all the processing steps stuck together.

LeonardFritz commented 1 year ago

Hey Lyle,

Sure I'll attach some example files! Thanks for your help. In the attached archive you'll find 20 files, 10 generated using your genBCAD function (0.bcad to 9.bcad), the other 10 using my version (gen_sample_0.bcad to gen_sample_9.bcad).

-- Leonard

example_BCAD_files.zip

Lyleregenwetter commented 1 year ago

Hey Leonard,

Sorry for the slow response here... I haven't really been able to identify the issue. I can open both your version and mine in BikeCAD, but they are kind of messed up... some screenshots:

My version: [image]

Your version: [image]

The files also seem laggy in the software for some reason. I suspect that there are some weird values in the files that are creating unresolvable geometry somewhere, but I haven't pinpointed what it is. In fact, most of the values that I checked seem quite reasonable. I'm assuming these are VAE-generated samples? Perhaps you can do a couple of things to help us debug here. First: can you try taking a couple of the original bikes from the dataset and regenerating BikeCAD files from them? Then see if they open for you. Second, to help me understand what might be going wrong: which parameters is your model generating, and which are you leaving at default values?

Also, all of the bikes seem to have this same structure, so it may be that your generative model is just stuck in some region of the design space where the geometry is invalid. It's not uncommon for generative models to collapse in some off-distribution area that may be geometrically invalid, but this would be more typical for a GAN variant, I'd imagine.

Anyways, sorry for a lack of a concrete solution, but hope this is a step in the right direction.

Lyle

LeonardFritz commented 1 year ago

Hey Lyle,

So I did some more testing yesterday and through the browser console I found out the reason that nothing happens when I try to open files on the website is that eventually it will throw this error:

07:56:33.301 XHRGET https://www.bikecad.ca/uBCADfiles/1.bcad [HTTP/1.1 200 OK 0ms]

07:56:33.482 java.lang.IllegalArgumentException: setSelectedIndex: -16 out of bounds cheerpOS.js:1772:11

07:56:33.485 at javax.swing.JComboBox.setSelectedIndex(Unknown Source) cheerpOS.js:1772:11

07:56:33.487 at basic.DecalPanel.propRead(Unknown Source) cheerpOS.js:1772:11

07:56:33.490 at basic.paintSchemes.propRead(Unknown Source) cheerpOS.js:1772:11

07:56:33.493 at basic.bikeCADPro.ReadIn(Unknown Source) cheerpOS.js:1772:11

07:56:33.496 at basic.bikeCADPro.PropOpen(Unknown Source) cheerpOS.js:1772:11

07:56:33.500 at basic.bikeCADPro.PropOpenFirst(Unknown Source) cheerpOS.js:1772:11

07:56:33.505 at basic.bikeCADPro.init(Unknown Source) cheerpOS.js:1772:11

07:56:33.507 at sun.applet.AppletPanel.run(Unknown Source) cheerpOS.js:1772:11

07:56:33.509 at java.lang.Thread.run(Unknown Source) cheerpOS.js:1772:11

Next I tried recreating a model from the dataset, without involving the VAE, just running it through the XML conversion, and that worked. So it seems the VAE was generating invalid models, rather than the genBCAD or to_xml functions producing invalid bcad files. I've tried training a few different models and I've gotten some valid files that actually load, but none of the outputs visually resemble bikes; the geometry is still messed up.
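The `setSelectedIndex: -16 out of bounds` message in the trace above suggests that some generated index-like parameter fell outside the range the application accepts. One way to catch this before writing a file might be to compare each generated value against the range seen in the original dataset. This is only a sketch: the key names and helper functions are hypothetical, and clipping can make a file loadable without making the geometry valid.

```python
def column_ranges(dataset_rows):
    """Record the (min, max) observed for each parameter in the dataset."""
    ranges = {}
    for row in dataset_rows:
        for key, value in row.items():
            low, high = ranges.get(key, (value, value))
            ranges[key] = (min(low, value), max(high, value))
    return ranges


def out_of_range(sample, ranges):
    """Return the parameters of `sample` that fall outside the observed range."""
    flagged = {}
    for key, value in sample.items():
        if key in ranges:
            low, high = ranges[key]
            if not low <= value <= high:
                flagged[key] = value
    return flagged


def clip_sample(sample, ranges, integer_keys=()):
    """Round integer-valued parameters, then clip everything into range."""
    fixed = {}
    for key, value in sample.items():
        if key in integer_keys:
            value = int(round(value))
        if key in ranges:
            low, high = ranges[key]
            value = min(max(value, low), high)
        fixed[key] = value
    return fixed


# Two made-up dataset rows and one made-up generated sample.
dataset = [
    {"Decal style": 0, "Head angle": 70.0},
    {"Decal style": 3, "Head angle": 74.5},
]
sample = {"Decal style": -16.2, "Head angle": 72.3}

ranges = column_ranges(dataset)
flagged = out_of_range(sample, ranges)
repaired = clip_sample(sample, ranges, integer_keys={"Decal style"})
```

Inspecting `flagged` first is probably more informative than clipping right away, since many flagged parameters would point to a model problem rather than a file-writing problem.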

Edit: Oh, and to answer your question, I just had the VAE generate all ~2400 parameters present in the standard dataset. I'm not very experienced with VAEs (or generative AI in general), so I don't know if that's the most sensible approach or not.

Anyways thanks so much for your help & patience! -- Leonard

Lyleregenwetter commented 1 year ago

Hi Leonard,

Great! Glad we were able to get to the bottom of this. Yeah, in my experience generative models really struggle with this dataset (it's part of a larger trend of generative models that excel in vision/NLP struggling to observe geometric/physical constraints in design problems). This is the subject of much of my thesis work and an upcoming paper looking at training generative models with both valid and invalid datapoints to help them avoid invalid regions of the space. In general, if you are struggling with precision, I'd try training a very small model, which will probably not capture all the variety in the dataset, but should hopefully get the 'average' bike right.

Generally, if you constrain the model to learn only a subset of the parameters, it has fewer opportunities to "mess up," but then the representative power of the model is diminished. I've tried a few different models on the full ~2400 parameter dataset, including the CTGAN and TVAE from this paper. They were generally able to create valid bikes, but the bikes still looked pretty weird. Eventually I just decided to use various smaller subsets of the parameter space.

Anyways, hope this insight is valuable! -Lyle

Lyleregenwetter commented 1 year ago

Going to close this issue, but feel free to keep commenting in the thread or get in touch if you want to chat about generative AI.