Closed retorquere closed 2 years ago
Hey @retorquere, you can run htmltest -l0 to log every file as it's tested.
I'm an idiot. Of course I wanted level 0, sorry.
The offending page is at https://gist.github.com/6c955708ecfa70ff55d363c485f9eb1e
No wait -- the log ends at
pull-export/index.html
DOCTYPE html []
--- pull-export/index.html --> <nil>
testDocument on test/index.html
panic: runtime error: invalid memory address or nil pointer dereference
so which of these two is likely the culprit: pull-export/index.html or test/index.html?
I have another file on which it consistently crashes, but if I test only that file, it passes.
It'll be test/index.html there. The debug message "testDocument on…" is the first thing logged for each document, once the previous one is finished and the next one is starting.
It crashes on more files now. I've removed test/index.html since, but I still have others. My current run ends with
testDocument on installation/configuration/hidden-preferences/index.html
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x1111246]
but it may not be something about that file in particular; if I set DirectoryPath to public/installation/configuration (it's usually set to public), it does not crash.
My site is available as a tarball on https://0x0.st/z2DV.gz
(but that tarball was produced on macOS, which means that Support and support are deemed to be the same file)
Thanks! I'm on holiday next week but will try to have a look at this at some point in July.
Thanks! Is there anything I can do in the interim to help debugging this?
I've tried this on a linux system and it runs without issue there.
Ah, that's very interesting. I'm no expert on osx (only access I have is the Travis test runners). Right now, without looking at code, unfortunately I don't have any ideas.
No issue. When you're back I'd be happy to run an instrumented version that may give more insight.
I'm seeing the same on Linux on Ubuntu Focal:
node@791983aec7ee:~/antora-base$ ./bin/htmltest -l0 public
htmltest started at 09:18:46 on public
========================================================================
0: DirectoryPath string = public
1: DirectoryIndex string = index.html
2: FilePath string =
3: FileExtension string = .html
4: CheckDoctype bool = true
5: CheckAnchors bool = true
6: CheckLinks bool = true
7: CheckImages bool = true
8: CheckScripts bool = true
9: CheckMeta bool = true
10: CheckGeneric bool = true
11: CheckExternal bool = true
12: CheckInternal bool = true
13: CheckInternalHash bool = true
14: CheckMailto bool = true
15: CheckTel bool = true
16: CheckFavicon bool = false
17: CheckMetaRefresh bool = true
18: EnforceHTML5 bool = false
19: EnforceHTTPS bool = false
20: IgnoreURLs []interface {} = []
21: IgnoreDirs []interface {} = []
22: IgnoreInternalEmptyHash bool = false
23: IgnoreEmptyHref bool = false
24: IgnoreCanonicalBrokenLinks bool = true
25: IgnoreExternalBrokenLinks bool = false
26: IgnoreAltMissing bool = false
27: IgnoreDirectoryMissingTrailingSlash bool = false
28: IgnoreSSLVerify bool = false
29: IgnoreTagAttribute string = data-proofer-ignore
30: HTTPHeaders map[interface {}]interface {} = map[Accept:*/* Range:bytes=0-0]
31: TestFilesConcurrently bool = false
32: DocumentConcurrencyLimit int = 128
33: HTTPConcurrencyLimit int = 16
34: LogLevel int = 0
35: LogSort string = document
36: ExternalTimeout int = 15
37: StripQueryString bool = true
38: StripQueryExcludes []string = [fonts.googleapis.com]
39: EnableCache bool = true
40: EnableLog bool = true
41: OutputDir string = tmp/.htmltest
42: OutputCacheFile string = refcache.json
43: OutputLogFile string = htmltest.log
44: CacheExpires string = 336h
45: NoRun bool = false
46: VCREnable bool = false
47: Version string = 0.12.1
testDocument on Home/faq.html
Home/faq.html
DOCTYPE html []
--- Home/faq.html --> <nil>
from cache --- Home/faq.html --> https://docs.tpwiki.com/Home/faq.html
OK --- Home/faq.html --> https://docs.tpwiki.com/Home/faq.html
from cache --- Home/faq.html --> https://docs.tpwiki.com
OK --- Home/faq.html --> https://docs.tpwiki.com
target does not exist --- Home/faq.html --> /oauth2/sign_out
testDocument on Home/index.html
Home/index.html
DOCTYPE html []
--- Home/index.html --> <nil>
from cache --- Home/index.html --> https://docs.tpwiki.com/Home/index.html
OK --- Home/index.html --> https://docs.tpwiki.com/Home/index.html
from cache --- Home/index.html --> https://docs.tpwiki.com
OK --- Home/index.html --> https://docs.tpwiki.com
target does not exist --- Home/index.html --> /oauth2/sign_out
testDocument on SEL751_Arc_Flash_Protection_Settings/unstable/downloads/Downloads.html
SEL751_Arc_Flash_Protection_Settings/unstable/downloads/Downloads.html
DOCTYPE html []
--- SEL751_Arc_Flash_Protection_Settings/unstable/downloads/Downloads.html --> <nil>
from cache --- SEL751_Arc_Flash_Protection_Settings/unstable/downloads/Downloads.html --> https://docs.tpwiki.com
OK --- SEL751_Arc_Flash_Protection_Settings/unstable/downloads/Downloads.html --> https://docs.tpwiki.com
target does not exist --- SEL751_Arc_Flash_Protection_Settings/unstable/downloads/Downloads.html --> /oauth2/sign_out
from cache --- SEL751_Arc_Flash_Protection_Settings/unstable/downloads/Downloads.html --> https://gitlab.tpwiki.com/standard-designs/arc-flash-protection/SEL751_Arc_Flash_Protection_Settings/tree/master
OK --- SEL751_Arc_Flash_Protection_Settings/unstable/downloads/Downloads.html --> https://gitlab.tpwiki.com/standard-designs/arc-flash-protection/SEL751_Arc_Flash_Protection_Settings/tree/master
from cache --- SEL751_Arc_Flash_Protection_Settings/unstable/downloads/Downloads.html --> https://gitlab.tpwiki.com/standard-designs/arc-flash-protection/SEL751_Arc_Flash_Protection_Settings/issues
OK --- SEL751_Arc_Flash_Protection_Settings/unstable/downloads/Downloads.html --> https://gitlab.tpwiki.com/standard-designs/arc-flash-protection/SEL751_Arc_Flash_Protection_Settings/issues
from cache --- SEL751_Arc_Flash_Protection_Settings/unstable/downloads/Downloads.html --> https://gitlab.tpwiki.com/standard-designs/arc-flash-protection/SEL751_Arc_Flash_Protection_Settings/compare/master...master
OK --- SEL751_Arc_Flash_Protection_Settings/unstable/downloads/Downloads.html --> https://gitlab.tpwiki.com/standard-designs/arc-flash-protection/SEL751_Arc_Flash_Protection_Settings/compare/master...master
testDocument on SEL751_Arc_Flash_Protection_Settings/unstable/setting_guide/Setting_Guide.html
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x51c0d7]
goroutine 1 [running]:
github.com/wjdp/htmltest/htmldoc.(*Document).Parse(0x0)
/home/travis/gopath/src/github.com/wjdp/htmltest/htmldoc/document.go:47 +0x37
github.com/wjdp/htmltest/htmldoc.(*Document).IsHashValid(...)
/home/travis/gopath/src/github.com/wjdp/htmltest/htmldoc/document.go:112
github.com/wjdp/htmltest/htmltest.(*HTMLTest).checkInternalHash(0xc0000ce240, 0xc0003210b0)
/home/travis/gopath/src/github.com/wjdp/htmltest/htmltest/check-link.go:325 +0xb0
github.com/wjdp/htmltest/htmltest.(*HTMLTest).checkInternal(0xc0000ce240, 0xc0003210b0)
/home/travis/gopath/src/github.com/wjdp/htmltest/htmltest/check-link.go:299 +0x15d
github.com/wjdp/htmltest/htmltest.(*HTMLTest).checkLink(0xc0000ce240, 0xc0000fe480, 0xc0001ed0a0)
/home/travis/gopath/src/github.com/wjdp/htmltest/htmltest/check-link.go:97 +0x5ec
github.com/wjdp/htmltest/htmltest.(*HTMLTest).testDocument(0xc0000ce240, 0xc0000fe480)
/home/travis/gopath/src/github.com/wjdp/htmltest/htmltest/htmltest.go:204 +0x18c
github.com/wjdp/htmltest/htmltest.(*HTMLTest).testDocuments(0xc0000ce240)
/home/travis/gopath/src/github.com/wjdp/htmltest/htmltest/htmltest.go:183 +0x65
github.com/wjdp/htmltest/htmltest.Test(0xc000013950, 0xc000010018, 0xc0000f9d48, 0x1)
/home/travis/gopath/src/github.com/wjdp/htmltest/htmltest/htmltest.go:143 +0x89b
main.run(0xc000013950, 0xc000013950)
/home/travis/gopath/src/github.com/wjdp/htmltest/main.go:159 +0x207
main.main()
/home/travis/gopath/src/github.com/wjdp/htmltest/main.go:66 +0x268
My system is:
Linux 791983aec7ee 4.19.0-8-amd64 #1 SMP Debian 4.19.98-1 (2020-01-26) x86_64 x86_64 x86_64 GNU/Linux
running within a Docker container.
Happy to provide further information. This error is highly consistent and always occurs.
My directory is also public, but I tried public2 and 2xxx, both of which it also crashed on with the same errors.
This seems to be an issue with parsing HTML. I know this issue is very old but @danyill do you have a copy of the files that caused the crash?
@wjdp Sorry for the slow response, time is getting away from me. I have a copy of a very similar one which also crashes on the latest version of htmltest. I can't share this publicly but am happy to provide it to you. What is the easiest way to get it to you? Can I email it to your commit address? (1.5 MB file with embedded images.)
Hi, @wjdp. I was able to replicate this error.
In my case I have two pages; the first page has an anchor link to the second.
page 1 public/docs/dev/index.html
...
<!DOCTYPE html>
<html>
<title>test</title>
<meta name="viewport" content="width=device-width, initial-scale=1">
<body>
<a href="/docs/hello-configuration/#link">
<code class="language-text">link</code>
</a>
</body>
</html>
...
page 2 public/docs/hello-configuration/index.html
...
<!DOCTYPE html>
<html>
<title>test</title>
<meta name="viewport" content="width=device-width, initial-scale=1">
<body>
<h2 id="link" style="position:relative">
<a href="#link">
</a>link
</h2>
</body>
</html>
...
.htmltest.yml:
IgnoreDirectoryMissingTrailingSlash: true
DirectoryPath: "public/"
IgnoreAltMissing: true
OutputDir: "tmp/.htmltest"
OutputCacheFile: "refcache.json"
OutputLogFile: "htmltest.log"
IgnoreDirs:
- hello
The links are valid: when I run the site on localhost or on a server, they work OK. Is there any workaround? Please let me know if you need more details.
UPDATED (27.10.21): once I remove DirectoryPath: "public/" from .htmltest.yml, it seems to be working.
UPDATED (28.10.21): the page is hello-configuration/index.html, and in the config I have IgnoreDirs -> - hello. This is the cause! Once I rename my file, remove IgnoreDirs, or change the name hello to hello/, this error is gone.
Been digging into this while watching tv 😄
I'm quite sure I've narrowed down the culprit: debugging shows me that hT.documentStore.ResolveRef(ref) can return a response of (nil, false), but the ok value is never checked.
I'm currently fairly sure I can hit this issue in one of two ways:
- Pointing htmltest at an HTML page that has links to parent directories.
- Pointing htmltest at an HTML page that has valid (at least I think they are valid) links to a set of pages covered by IgnoreDirs.
From there, any method called on the nil document will panic if it references internal members.
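To make the mechanism concrete, here is a minimal sketch of why a nil document blows up, matching the (*Document).Parse(0x0) frame in the stack trace earlier in the thread. The Document type below is a hypothetical stand-in, not htmltest's real one; the point is only that Go lets you call a method on a nil pointer and panics when the method touches a field.

```go
package main

import "fmt"

// Document is a hypothetical stand-in for a parsed page,
// not htmltest's real type.
type Document struct {
	parsed bool
}

// Parse writes to a field, so it dereferences the receiver.
func (d *Document) Parse() {
	d.parsed = true
}

// callParse reports whether calling Parse on d panicked.
func callParse(d *Document) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	d.Parse()
	return false
}

func main() {
	var missing *Document // what an unchecked failed lookup hands back
	fmt.Println("nil document panics:", callParse(missing))
	fmt.Println("real document panics:", callParse(&Document{}))
}
```

The method call itself is legal on a nil pointer; it is the field access inside Parse that triggers the nil pointer dereference.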
I'll keep digging, but I wanted to report on my progress in case it spurred someone else to see the correct path through to resolving this issue.
So, easy enough fix for the panic: check the ok value returned from ResolveRef (my branch is over here):
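As a rough illustration of that kind of guard (this is not the code from the branch), here is a sketch that assumes a simplified store whose ResolveRef returns (*Document, bool). The types, the string argument, and the checkInternal helper are hypothetical stand-ins for htmltest's real ones.

```go
package main

import "fmt"

// Document and DocumentStore are hypothetical, simplified stand-ins.
type Document struct{ SitePath string }

type DocumentStore struct {
	byPath map[string]*Document
}

// ResolveRef mimics the (document, ok) shape described in the thread.
func (s *DocumentStore) ResolveRef(path string) (*Document, bool) {
	d, ok := s.byPath[path]
	return d, ok
}

// checkInternal returns an error instead of dereferencing a missing document.
func checkInternal(s *DocumentStore, path string) error {
	doc, ok := s.ResolveRef(path)
	if !ok || doc == nil {
		return fmt.Errorf("target does not exist: %s", path)
	}
	_ = doc.SitePath // safe: doc is known to be non-nil here
	return nil
}

func main() {
	store := &DocumentStore{byPath: map[string]*Document{
		"docs/dev/index.html": {SitePath: "docs/dev/index.html"},
	}}
	fmt.Println(checkInternal(store, "docs/dev/index.html"))
	fmt.Println(checkInternal(store, "docs/hello-configuration/index.html"))
}
```

With the check in place, a missing target becomes an ordinary reported error rather than a crash.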
The next issue I run into is that this reference I have should resolve, but it doesn't because (I assume) the document it points to isn't available in DocumentPathMap since it matches IgnoreDirs 🤔 Now to work out how that gets populated.
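On why hello-configuration would match an IgnoreDirs entry of hello at all: the entries are documented as regular expressions, so an unanchored pattern matches anywhere in the path. The sketch below assumes the pattern is tested against the directory path with a trailing slash appended; that detail, and the isDirIgnored helper itself, are assumptions for illustration rather than htmltest's actual code.

```go
package main

import (
	"fmt"
	"regexp"
)

// isDirIgnored reports whether dir matches any pattern.
// Assumption: patterns are unanchored regular expressions tested against
// the directory path with a trailing slash appended.
func isDirIgnored(patterns []string, dir string) bool {
	for _, p := range patterns {
		if ok, err := regexp.MatchString(p, dir+"/"); err == nil && ok {
			return true
		}
	}
	return false
}

func main() {
	dir := "docs/hello-configuration"
	fmt.Println("hello  ignores it:", isDirIgnored([]string{"hello"}, dir))
	fmt.Println("hello/ ignores it:", isDirIgnored([]string{"hello/"}, dir))
}
```

Under these assumptions the bare pattern hello swallows hello-configuration, while hello/ does not, which lines up with the workaround reported above.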
Okay! I think I got it working! I had to keep the list of all Document in DocumentStore and add a property to say whether they should be ignored for testing or not. That allows references to be checked against them, while they can be skipped over for testing.
PR coming shortly!
htmltest is erroring out when I run it:
To Reproduce
Steps to reproduce the behaviour:
.htmltest.yml
Please copy in your config file
Source files
I haven't been able to narrow it down yet -- my request is for htmltest to print the page it's processing to help narrow it down.
Expected behaviour
print each page as it's being processed
Versions