liip / TheA11yMachine

The A11y Machine is an automated accessibility testing tool which crawls and tests pages of any web application to produce detailed reports.
https://www.liip.ch/

opentransportdata.swiss is not recursively crawled #78

Closed fanderegg closed 7 years ago

fanderegg commented 7 years ago

The website https://opentransportdata.swiss/ is not recursively crawled. Only the links of the language switcher are recognized and crawled, but after that it stops. Command used:

a11ym https://opentransportdata.swiss/ -o out

The website consists of two different pieces of software (CKAN and WordPress), and it looks like the crawler does not like the WordPress pages (e.g. https://opentransportdata.swiss/de/cookbook/ or https://opentransportdata.swiss/de/) but correctly crawls the CKAN pages (e.g. https://opentransportdata.swiss/de/dataset/).

Hywan commented 7 years ago

Hello,

The DOM is broken. However, the crawler uses regular expressions to find links, so I don't understand yet why it does not work. Still searching. You might want to fix the DOM though ;-).
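
To illustrate why a broken DOM should not matter to a regex-based crawler, here is a minimal sketch. It is not the expression The A11y Machine actually uses; `HREF_RE`, `extract_links`, and the sample markup are hypothetical and only show that link extraction by pattern matching does not depend on well-formed HTML.

```python
import re

# Hypothetical pattern (an assumption, not A11yM's real one): capture the
# quoted href value of any <a ...> tag, whatever the surrounding markup is.
HREF_RE = re.compile(r'<a\b[^>]*?\bhref\s*=\s*["\']([^"\']+)["\']', re.IGNORECASE)


def extract_links(html):
    """Return every href found in anchor tags, in document order."""
    return HREF_RE.findall(html)


# Deliberately malformed markup: unclosed <p>, stray </div>, unclosed <a>.
broken_html = (
    '<div><p><a href="/de/cookbook/">Cookbook</a></div>'
    "<a href='/de/dataset/'>Data"
)

links = extract_links(broken_html)
print(links)
```

Both links are found even though the markup would not parse into a clean tree, which is why the broken DOM alone does not explain the missing pages.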

Hywan commented 7 years ago

I finally found the problem. Sometimes we encounter a URL like /en/log-in, and sometimes /en/log-in/. When crawling /en/log-in, the server responds with a 301 redirect to /en/log-in/, and the crawler does not like it.

I am working on a fix.
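
For readers who want to see the failure mode in isolation, here is a small sketch of one way a crawler can cope with such redirects. It is not the fix that went into The A11y Machine; `FAKE_SITE`, `fetch`, `resolve`, and `crawl` are hypothetical names, and the in-memory "server" stands in for real HTTP requests. The idea is to follow the 301 and record the page under its final path, so /en/log-in and /en/log-in/ count as one page.

```python
from urllib.parse import urljoin

# Hypothetical in-memory site: path -> (status, Location header, links on page).
FAKE_SITE = {
    "/en/": (200, None, ["/en/log-in"]),
    "/en/log-in": (301, "/en/log-in/", []),
    "/en/log-in/": (200, None, ["/en/", "/en/log-in"]),
}


def fetch(path):
    """Stand-in for an HTTP request (assumption: no network involved)."""
    return FAKE_SITE[path]


def resolve(path, max_hops=5):
    """Follow 301/302 redirects; return the final path and its links."""
    for _ in range(max_hops):
        status, location, links = fetch(path)
        if status in (301, 302) and location:
            # Location may be relative, so resolve it against the current path.
            path = urljoin(path, location)
            continue
        return path, links
    raise RuntimeError("too many redirects")


def crawl(start):
    """Breadth-first crawl that lists each page once, by post-redirect path."""
    seen, queue, pages = set(), [start], []
    while queue:
        path = queue.pop(0)
        if path in seen:
            continue
        seen.add(path)
        final, links = resolve(path)
        seen.add(final)  # the redirect target counts as visited too
        if final not in pages:
            pages.append(final)
        queue.extend(links)
    return pages


pages = crawl("/en/")
print(pages)
```

In this sketch the redirecting URL and its target are both marked as seen, so the crawl neither stops at the 301 nor visits the same page twice.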

Hywan commented 7 years ago

Much better, isn't it? :-)

[Screenshot: screen shot 2016-12-21 at 12 06 23-fullpage]

(run with ./a11ym https://opentransportdata.swiss/en).