zsaralin / DOM-tree-data

Uses DOM of 100 webpages to generate structural priors to improve the vision-based segmentation of web pages

How to get new_data? #1

Open Babowings opened 1 year ago

Babowings commented 1 year ago

I'm new to web page segmentation. I'm confused about how you converted the HTML DOM files into the format of the files in new_data. My HTML DOM file is very messy: all the elements are stacked and mixed together.

zsaralin commented 1 year ago

Hello!

I used a script that runs as a Firefox extension and converts the DOM to a string representation. Unfortunately, it was written by an old professor of mine, so I'm unable to share it with you, but I hope that helps!

Saralin
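
Since the original script can't be shared, here is a minimal sketch of the general idea (walking a DOM and emitting a string representation), in case it helps as a starting point. It is not the script used for this repository, and the output layout (one indented line per element, with `#id` and `.class` suffixes) is an assumption rather than the actual new_data format. The names `DomOutliner` and `dom_outline` are hypothetical. It uses only Python's standard `html.parser` module and works on saved HTML rather than a live Firefox DOM.

```python
from html.parser import HTMLParser

# Void elements never have a closing tag, so they must not open a new level.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class DomOutliner(HTMLParser):
    """Collects one indented line per element.

    The line format is a hypothetical choice, not the new_data format.
    """

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open (non-void) elements
        self.lines = []   # output, one line per element

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        label = tag
        if attr_map.get("id"):
            label += "#" + attr_map["id"]
        if attr_map.get("class"):
            label += "." + ".".join(attr_map["class"].split())
        # Indent by the number of currently open ancestors.
        self.lines.append("  " * len(self.stack) + label)
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Ignore stray closing tags; otherwise pop back to the matching
        # open tag, which also discards children that were never closed.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def dom_outline(html: str) -> str:
    """Return an indented, one-element-per-line outline of the markup."""
    parser = DomOutliner()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.lines)


if __name__ == "__main__":
    sample = ('<html><body><div id="main" class="a b">'
              '<p>Hi<br>there</p><img src="x.png"></div></body></html>')
    print(dom_outline(sample))
```

For messy input where elements look "stacked and mixed together", an outline like this can make the nesting easier to read. Note that it does not apply the browser's error-recovery rules, so the nesting it reports for badly malformed markup may differ from what Firefox builds.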

On Sun, Sep 17, 2023 at 8:51 AM Babowings wrote:

I'm new to web page segmentation. I'm confused about how you converted the HTML DOM files into the format of the files in new_data. My HTML DOM file is very messy: all the elements are stacked and mixed together.
