NoviScl / Design2Code

MIT License
How to preprocess the WebSight dataset? #21

Closed (shipengai closed this issue 6 months ago)

shipengai commented 7 months ago

In the WebSight dataset, the HTML code contains image URLs. Should these links be replaced?

StevenyzZhang commented 6 months ago

We train the model on WebSight v0.1, whose HTML does not contain image URLs most of the time. We are aware that WebSight v0.2 does contain image URLs, which it uses to fetch images related to the website.

You can replace them with the placeholder image name and re-render the screenshots before training your model. This will make your model generate the placeholder image by default.

Alternatively, you can keep those URLs during training and replace them in the generated predictions before calculating the metrics of our benchmark.
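
For either option, the replacement step could look roughly like the sketch below. It is only an illustration, not the repository's own preprocessing code: the function name `replace_image_sources` and the default file name `placeholder.jpg` are assumptions, so substitute whatever placeholder image name the benchmark expects. It only rewrites the `src` attribute of `<img>` tags; images referenced elsewhere (for example in CSS) would need separate handling.

```python
import re

# Matches the src attribute inside an <img> tag, quoted with either " or '.
# Group 1: everything from "<img" up to and including "src=",
# group 2: the quote character, group 3: the original URL.
IMG_SRC_RE = re.compile(
    r'(<img\b[^>]*?\ssrc\s*=\s*)(["\'])(.*?)\2',
    flags=re.IGNORECASE | re.DOTALL,
)


def replace_image_sources(html: str, placeholder: str = "placeholder.jpg") -> str:
    """Return html with every <img> src replaced by the placeholder name.

    The default placeholder name is hypothetical; pass the name your
    rendering and evaluation setup actually uses.
    """
    return IMG_SRC_RE.sub(
        lambda m: f"{m.group(1)}{m.group(2)}{placeholder}{m.group(2)}",
        html,
    )


if __name__ == "__main__":
    sample = '<div><img src="https://example.com/photo.png" alt="photo"></div>'
    print(replace_image_sources(sample))
    # prints: <div><img src="placeholder.jpg" alt="photo"></div>
```

If you apply it before training, remember to re-render the screenshots from the rewritten HTML so that the image and the code stay consistent. If you apply it after generation, run it on the predicted HTML before rendering it for evaluation.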