RichardLitt closed this issue 8 years ago
Is there a parse error behind results like "3. 404 https://en.wikipedia.org/wiki/Open_source)"? The URL ends in a ")" that shouldn't be there, and the link actually resolves in the document.
@dkhamsing what do you think about that issue?
There was a parsing issue in a previous version. The current version finds these results:
> Links
1. 302 https://github.com/RichardLitt/endangered-languages/edit/master/README.md
2. 302 http://wesay.org
3. 301 http://cdec-decoder.org/
4. 301 http://goo.gl/wdnz1W
5. 404 http://hunspell.cvs.sourceforge.net/hunspell
6. 301 http://nltk.github.com/
7. 503 http://www.onlinelinguisticdatabase.org
8. 404 http://dev.panlex.org/tools/
9. 301 http://ilk.uvt.nl/timbl/
10. 302 http://www.wavesurfer.fm
11. https://dative.lingsync.org/ hostname "dative.lingsync.org" does not match the server certificate
12. https://lexicondev.lingsync.org/ SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed
13. https://lexicondev.lingsync.org/analysisbytierbyword/inuktitut/nunaqjuaqli%20aaqkiksimalaunngilaq%20sunataqaranilu%20itijuqjuamik%20taaqtualuulluni%20guutiullu%20anirngninga%20ingirralauqpuq%20imaaluup%20qulaagut SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed
14. 404 http://www.ark.cs.cmu.edu/TurboParser/nasmith_models/kin-turbo-v1.0.tgz
15. 404 https://github.com/FieldDB/migmaqLessons
16. 404 https://github.com/cidles/mindericobot
17. 404 http://gielese.no
> Dupes
1. https://github.com/sindresorhus/awesome
2. https://wiki.mozilla.org/B2G
3. https://github.com/nltk/nltk
4. https://github.com/FieldDB
5. https://img.shields.io/github/stars/LowResourceLanguages/hltdi-morphology.svg
6. https://github.com/LowResourceLanguages/hltdi-morphology
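For reference, the trailing ")" problem from the top of the thread can be sketched as follows. This is only a minimal illustration of one way to avoid it, not awesome_bot's actual parser; the names `URL_RE` and `extract_urls` are made up for this example. The idea is to drop a trailing ")" only when the URL has no matching "(" of its own, so Markdown's closing parenthesis is stripped but a parenthesis that belongs to the URL is kept.

```python
import re

# Rough URL pattern (an assumption for this sketch): scheme, then any run of
# characters that is not whitespace, an angle bracket, or a closing square bracket.
URL_RE = re.compile(r"https?://[^\s<>\]]+")


def extract_urls(markdown: str) -> list[str]:
    """Return the URLs in the text, trimming characters that belong to the markup."""
    urls = []
    for match in URL_RE.finditer(markdown):
        url = match.group(0)
        while url:
            if url[-1] in ".,;:!?":
                # Sentence punctuation directly after the URL.
                url = url[:-1]
            elif url[-1] == ")" and url.count(")") > url.count("("):
                # Unbalanced ")" comes from the Markdown link syntax, not the URL.
                url = url[:-1]
            else:
                break
        urls.append(url)
    return urls


if __name__ == "__main__":
    print(extract_urls("[Open source](https://en.wikipedia.org/wiki/Open_source)"))
```

With this approach the Wikipedia link above would be checked without the stray ")", while a URL whose own path contains balanced parentheses is left intact.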
Actually, with your configuration, the current results are:
awesome_bot README.md --white-list https://github.com/sindresorhus/awesome,https://github.com/FieldDB,https://img.shields.io/github/stars/LowResourceLanguages/hltdi-morphology.svg,https://github.com/LowResourceLanguages/hltdi-morphology
https://travis-ci.org/RichardLitt/endangered-languages/builds/99912041
> Links
1. 302 https://github.com/RichardLitt/endangered-languages/edit/master/README.md
2. 302 http://wesay.org
3. 301 http://cdec-decoder.org/
4. 301 http://goo.gl/wdnz1W
5. 404 http://hunspell.cvs.sourceforge.net/hunspell
6. https://victorio.uit.no/langtech/trunk/ SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed
7. 301 http://nltk.github.com/
8. 404 http://dev.panlex.org/tools/
9. 503 http://www.onlinelinguisticdatabase.org
10. 301 http://ilk.uvt.nl/timbl/
11. 302 http://www.wavesurfer.fm
12. https://dative.lingsync.org/ hostname "dative.lingsync.org" does not match the server certificate
13. https://lexicondev.lingsync.org/ SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed
14. https://lexicondev.lingsync.org/analysisbytierbyword/inuktitut/nunaqjuaqli%20aaqkiksimalaunngilaq%20sunataqaranilu%20itijuqjuamik%20taaqtualuulluni%20guutiullu%20anirngninga%20ingirralauqpuq%20imaaluup%20qulaagut SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed
15. 404 http://www.ark.cs.cmu.edu/TurboParser/nasmith_models/kin-turbo-v1.0.tgz
16. 404 https://github.com/cidles/mindericobot
17. 404 http://gielese.no
18. http://qaamuus.so/ getaddrinfo: Name or service not known
> Dupes
1. https://wiki.mozilla.org/B2G
2. https://github.com/nltk/nltk
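The effect of the `--white-list` option in the run above can be sketched like this. The substring matching is an assumption rather than confirmed awesome_bot behavior, and `parse_white_list` and `apply_white_list` are hypothetical helper names. The assumption is at least consistent with the output in this thread: the white-listed entries no longer show up as dupes, and the 404 for https://github.com/FieldDB/migmaqLessons disappears once https://github.com/FieldDB is white-listed.

```python
def parse_white_list(option: str) -> list[str]:
    """Split a comma-separated white-list option into its entries."""
    return [entry.strip() for entry in option.split(",") if entry.strip()]


def apply_white_list(urls: list[str], white_list: list[str]) -> list[str]:
    """Keep only URLs that contain none of the white-listed entries.

    Substring matching is an assumption made for this sketch.
    """
    return [url for url in urls if not any(entry in url for entry in white_list)]


if __name__ == "__main__":
    option = (
        "https://github.com/sindresorhus/awesome,"
        "https://github.com/FieldDB,"
        "https://img.shields.io/github/stars/LowResourceLanguages/hltdi-morphology.svg,"
        "https://github.com/LowResourceLanguages/hltdi-morphology"
    )
    # The six dupes reported by the first run in this thread.
    dupes = [
        "https://github.com/sindresorhus/awesome",
        "https://wiki.mozilla.org/B2G",
        "https://github.com/nltk/nltk",
        "https://github.com/FieldDB",
        "https://img.shields.io/github/stars/LowResourceLanguages/hltdi-morphology.svg",
        "https://github.com/LowResourceLanguages/hltdi-morphology",
    ]
    print(apply_white_list(dupes, parse_white_list(option)))
```

Filtering the first run's six dupes this way leaves only the B2G and nltk links, which matches the two dupes reported above.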
Let me know if you have any questions.
@dkhamsing Cool, glad that that bug was fixed.
I'm curious about why `[https://wiki.mozilla.org/B2G](https://wiki.mozilla.org/B2G)` is counted as a dupe. That seems like a pretty simple Markdown case that shouldn't be flagged.
Yeah, the script can't distinguish that case. Can you change it in the README?
Done. Will attack the other issues later. This shouldn't be marked as a duplicate, though. An easy check would be to see whether the link is wrapped in a surrounding `[...](...)`.
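A rough sketch of that check, assuming the duplicate detection works on the raw Markdown text: collapse any `[url](url)` whose visible text equals its target into a single URL before counting. This is not how awesome_bot is implemented; `SELF_LINK_RE`, `URL_RE`, and `find_dupes` are hypothetical names for illustration.

```python
import re
from collections import Counter

# Markdown inline link whose visible text is identical to its target: [url](url).
SELF_LINK_RE = re.compile(r"\[(https?://[^\s\]]+)\]\(\1\)")

# Simplified URL pattern for this sketch: it stops at whitespace, brackets, and
# parentheses, so URLs that contain parentheses would be cut short.
URL_RE = re.compile(r"https?://[^\s<>\]\)]+")


def find_dupes(markdown: str) -> list[str]:
    """Return URLs that occur more than once, counting [url](url) as one occurrence."""
    collapsed = SELF_LINK_RE.sub(r"\1", markdown)
    counts = Counter(URL_RE.findall(collapsed))
    return [url for url, n in counts.items() if n > 1]


if __name__ == "__main__":
    readme = "See [https://wiki.mozilla.org/B2G](https://wiki.mozilla.org/B2G) for details"
    print(find_dupes(readme))
```

A link that really does appear twice in different places is still reported, because only the self-referencing `[url](url)` form is collapsed.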
Current issues:
Issues :-(
> Links
1. 302 https://github.com/RichardLitt/endangered-languages/edit/master/README.md
2. 302 http://wesay.org
3. 301 http://cdec-decoder.org/
4. 404 http://hunspell.cvs.sourceforge.net/hunspell
5. https://victorio.uit.no/langtech/trunk/ SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed
6. 404 http://dev.panlex.org/tools/
7. 301 http://ilk.uvt.nl/timbl/
8. 302 http://www.wavesurfer.fm
9. https://dative.lingsync.org/ hostname "dative.lingsync.org" does not match the server certificate
10. https://lexicondev.lingsync.org/analysisbytierbyword/inuktitut/nunaqjuaqli%20aaqkiksimalaunngilaq%20sunataqaranilu%20itijuqjuamik%20taaqtualuulluni%20guutiullu%20anirngninga%20ingirralauqpuq%20imaaluup%20qulaagut SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed
11. https://lexicondev.lingsync.org/ SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed
12. 404 http://www.ark.cs.cmu.edu/TurboParser/nasmith_models/kin-turbo-v1.0.tgz
13. 404 https://github.com/cidles/mindericobot
14. http://qaamuus.so/ getaddrinfo: Name or service not known
15. 404 http://gielese.no
> Dupes
None ✓
@RichardLitt thanks for the feedback
It appears that most of the links flagged in the most recent run (Jan 3) are legit. Should this issue be closed, or should Travis run again to check for more broken links?
You're right!
Ran travis to check links. Ran into these issues: