web-arena-x / webarena

Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
https://webarena.dev
Apache License 2.0

Links on shopping site return 404 errors #152

Open brandontrabucco opened 1 week ago

brandontrabucco commented 1 week ago

Hello Web Arena team,

On several pages, there is a set of links below the products that return 404 errors when clicked.

[image: screenshot of the affected links]

This image shows an example, and the links have the following format:

http://localhost:7770/dp/***/ref=***

I don't have an exhaustive list of the affected products, but on several that I've personally tested, links matching the above format lead to 404 errors. These include links on the following pages:

http://localhost:7770/dayton-audio-t652-dual-6-1-2-2-way-tower-speaker-pair.html

http://localhost:7770/milano-s-100-imported-romano-cheese-jar-16-ounce.html

http://localhost:7770/bonzy-home-glossy-led-tv-stand-black-tv-stand-with-led-rgb-lights-wood-media-storage-console-for-65-inch-tv-flat-screen-tv-cabinet-gaming-consoles-in-lounge-room-living-room-and-bedroom-black.html

This issue seems to exist for every product that has a comparison table in the product details.

Is this intended behavior, or could it be a bug in my instance of the site?

Thanks for the help! -Brandon

brandontrabucco commented 1 week ago

It appears that many links of the form /stores/page/*** are also broken in my environment. For example:

http://localhost:7770/stores/page/3FA15183-633E-472A-BE62-8725A0D9821D?store_ref=storeRecs_dp_aplus

which appears on this page:

http://localhost:7770/crestlive-products-dresser-storage-drawer-organizer-fabric-dresser-for-bedroom-living-room-entryway-closets-easy-pull-fabric-bins-wood-top-mixed-color.html
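
For anyone who wants to check their own instance, here is a minimal sketch that scans a product page's HTML for links matching the two formats reported in this thread (/dp/***/ref=*** and /stores/page/***). The regular expressions are my guess at what the `***` placeholders stand for, and the names `LinkCollector`, `is_suspect`, and `find_suspect_links` are hypothetical helpers, not part of WebArena. It only flags links by their path and does not request them, so it cannot confirm that a link actually returns 404.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

# Path patterns based on the two link formats reported in this issue.
# Assumption: each "***" is a non-empty segment; the real IDs may differ.
SUSPECT_PATTERNS = [
    re.compile(r"^/dp/[^/]+/ref=.+"),
    re.compile(r"^/stores/page/[^/]+"),
]


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def is_suspect(href):
    """Return True if the URL path matches one of the reported formats."""
    path = urlparse(href).path
    return any(pattern.match(path) for pattern in SUSPECT_PATTERNS)


def find_suspect_links(html):
    """Return all hrefs in the HTML that match a reported format, in order."""
    collector = LinkCollector()
    collector.feed(html)
    return [href for href in collector.hrefs if is_suspect(href)]
```

To use it, pass in the HTML of a saved product page, for example `find_suspect_links(open("page.html").read())`, where `page.html` is a hypothetical local copy. The query string (such as `?store_ref=...`) is ignored because only the URL path is matched.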

shuyanzhou commented 7 hours ago

Hi @brandontrabucco, great catch! I think this is expected since we imported the raw product description pages from Amazon without considering inter-page links. Does this behavior influence any task completions?