wummel / linkchecker

check links in web documents or full websites
http://wummel.github.io/linkchecker/
GNU General Public License v2.0

Linkchecker doesn't run with a file, but runs for a specific URL #599

Closed: mariashakhnovich closed this issue 9 years ago

mariashakhnovich commented 9 years ago

Running the following:

    linkchecker --config config_file sites_file

where sites_file is a line-separated list of URLs to check, like:

    www.site1.com
    www.site2.com

results in:

    INFO 2015-05-22 13:10:14,838 MainThread Checking intern URLs only; use --check-extern to check extern URLs.
    LinkChecker 9.3 Copyright (C) 2000-2014 Bastian Kleineidam
    LinkChecker comes with ABSOLUTELY NO WARRANTY!
    This is free software, and you are welcome to redistribute it
    under certain conditions. Look at the file `LICENSE' within this
    distribution.
    Get the newest version at http://wummel.github.io/linkchecker/
    Write comments and bugs to https://github.com/wummel/linkchecker/issues
    Support this project at http://wummel.github.io/linkchecker/donations.html

    Start checking at 2015-05-22 13:10:14-007

    Statistics:
    Downloaded: 0B.
    Content types: 0 image, 0 text, 0 video, 0 audio, 1 application, 0 mail and 0 other.
    URL lengths: min=78, max=78, avg=78.

    That's it. 1 link in 1 URL checked. 0 warnings found. 0 errors found.
    Stopped checking at 2015-05-22 13:10:14-007 (0.01 seconds)

However, running it like:

    linkchecker --config config_file www.site1.com

runs as it should, with result:

    INFO 2015-05-22 13:08:28,141 MainThread Checking intern URLs only; use --check-extern to check extern URLs.
    LinkChecker 9.3 Copyright (C) 2000-2014 Bastian Kleineidam
    LinkChecker comes with ABSOLUTELY NO WARRANTY!
    This is free software, and you are welcome to redistribute it
    under certain conditions. Look at the file `LICENSE' within this
    distribution.
    Get the newest version at http://wummel.github.io/linkchecker/
    Write comments and bugs to https://github.com/wummel/linkchecker/issues
    Support this project at http://wummel.github.io/linkchecker/donations.html

    Start checking at 2015-05-22 13:08:28-007
    1 thread active, 0 links queued, 0 links in 0 URLs checked, runtime 1 seconds
    10 threads active, 536 links queued, 1151 links in 18 URLs checked, runtime 6 seconds
    10 threads active, 612 links queued, 3006 links in 24 URLs checked, runtime 11 seconds
    10 threads active, 743 links queued, 4250 links in 29 URLs checked, runtime 16 seconds
    10 threads active, 707 links queued, 5357 links in 35 URLs checked, runtime 21 seconds
    10 threads active, 659 links queued, 5405 links in 40 URLs checked, runtime 26 seconds
    10 threads active, 617 links queued, 5447 links in 46 URLs checked, runtime 31 seconds
    10 threads active, 484 links queued, 5580 links in 53 URLs checked, runtime 36 seconds
    10 threads active, 404 links queued, 5660 links in 65 URLs checked, runtime 41 seconds
    10 threads active, 157 links queued, 5907 links in 77 URLs checked, runtime 46 seconds

    Statistics:
    Downloaded: 2.45MB.
    Content types: 235 image, 3629 text, 0 video, 0 audio, 564 application, 7 mail and 1639 other.
    URL lengths: min=12, max=176, avg=46.

    That's it. 6074 links in 78 URLs checked. 0 warnings found. 0 errors found.
    Stopped checking at 2015-05-22 13:09:19-007 (50 seconds)

Note: I was able to run the file of URLs without issues before, but I recently upgraded linkchecker to 9.3 (I'm not sure which version I had before).

Are there any limitations for the files that can be used? Specific format? Thanks!

mariashakhnovich commented 9 years ago

I don't know why I used to be able to run it like this, or why it's not allowing me to now, but this is the solution to my problem. Instead of:

    linkchecker --config config_file sites_file

do:

    cat sites_file | linkchecker --config config_file --stdin
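
For anyone scripting this workaround, it could be wrapped roughly as follows. This is only a minimal sketch: `clean_urls` and `check_sites` are hypothetical helper names, `sites_file` and `config_file` are the placeholder names used in this thread, and the only linkchecker options used are `--config` and `--stdin`, both taken from the commands above. Stripping blank lines and stray whitespace is an extra precaution assumed here, not something the issue shows to be required.

```shell
#!/bin/sh
# Hypothetical helper: trim leading/trailing whitespace and drop empty
# lines from a URL list read on standard input.
clean_urls() {
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d'
}

# Hypothetical wrapper around the workaround from this thread:
#   $1 = file with one URL per line, $2 = linkchecker config file.
# The cleaned list is passed to linkchecker on stdin via --stdin.
check_sites() {
    clean_urls < "$1" | linkchecker --config "$2" --stdin
}
```

With these definitions loaded, `check_sites sites_file config_file` should behave like the `cat sites_file | linkchecker --config config_file --stdin` command above, assuming linkchecker is on the PATH.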