solbu / hldig

hl://Dig is a fork of ht://Dig, a web indexing and searching system for a small domain or intranet
https://solbu.github.io/hldig/

hldig can't crawl https URLs anymore #122

Closed: andy5995 closed this issue 6 years ago

andy5995 commented 6 years ago

hldig can't crawl https URLs.

On the dreamhosters server, I got a segfault when I ran rundig. Last time I used it there, which was a few months ago, it worked.

This means the demo won't work either, because that server is where hldig runs.

On my system it didn't segfault, but it didn't crawl correctly either: as the output below shows, it stopped after the first redirect and reported only 1 document.

andy@oceanus:~/src/hldig/install-test/bin$ ./hldig -ivs
hldig Start Time: Tue Sep  4 01:25:51 2018

New server: solbu.github.com, 443
0:2:0:https://solbu.github.com/hldig/:  redirect
hldig: Run complete
hldig: 1 server seen:
hldig:     solbu.github.com:443 1 document

HTTP statistics
===============
 Persistent connections    : Yes
 HEAD call before GET      : Yes
 Connections opened        : 2
 Connections closed        : 1
 Changes of server         : 0
 HTTP Requests             : 2
 HTTP KBytes requested     : 0.939453
 HTTP Average request time : 0 secs
 HTTP Average speed        : inf KBytes/secs

hldig End Time: Tue Sep  4 01:25:51 2018
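
A quick way to spot a truncated crawl like this one is to read the per-server document counts out of the run summary above. The following is only a minimal sketch: the function name, the regular expression, and the "one document is suspicious" threshold are assumptions made for illustration, and the plural form `documents` is guessed from the singular line shown in the log.

```python
import re

# Matches per-server summary lines such as
# "hldig:     solbu.github.com:443 1 document"
# (the plural "documents" is an assumption, not taken from the log above).
SERVER_LINE = re.compile(r"^hldig:\s+(\S+):(\d+)\s+(\d+)\s+documents?\s*$")


def documents_per_server(log_text):
    """Return {(host, port): document_count} parsed from an hldig run log."""
    counts = {}
    for line in log_text.splitlines():
        match = SERVER_LINE.match(line)
        if match:
            host, port, docs = match.groups()
            counts[(host, int(port))] = int(docs)
    return counts


# Excerpt of the run summary quoted in this issue.
SAMPLE_LOG = """\
hldig: Run complete
hldig: 1 server seen:
hldig:     solbu.github.com:443 1 document
"""

if __name__ == "__main__":
    for (host, port), docs in documents_per_server(SAMPLE_LOG).items():
        # Hypothetical threshold: a whole site yielding one document is suspicious.
        status = "suspicious" if docs <= 1 else "ok"
        print(f"{host}:{port} -> {docs} document(s), {status}")
```

Run against a saved log, this would flag the crawl above, because the only server seen produced a single document.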

andy5995 commented 6 years ago

The problem started at some point between these two commits: 754bf32 and 888007d.
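
Narrowing a regression down within a commit range is a binary search, which is what `git bisect` automates. The sketch below only illustrates the idea. The commit names are placeholders rather than real hldig commits, and `crawl_fails` stands in for building a commit and running `./hldig -ivs` against an https URL by hand.

```python
def first_bad_commit(commits, is_bad):
    """Binary-search an oldest-to-newest commit list for the first bad commit.

    Assumes commits[0] is good, commits[-1] is bad, and that once a commit
    is bad every later commit is bad too.
    """
    low, high = 0, len(commits) - 1  # low: known good, high: known bad
    while high - low > 1:
        mid = (low + high) // 2
        if is_bad(commits[mid]):
            high = mid  # regression is at mid or earlier
        else:
            low = mid   # regression is after mid
    return commits[high]


# Placeholder history; these names are not real hldig commits.
history = ["good-end", "c1", "c2", "c3", "c4", "bad-end"]
broken_since = "c3"  # pretend the regression landed here


def crawl_fails(commit):
    # Stand-in for "build this commit and check whether the https crawl works".
    return history.index(commit) >= history.index(broken_since)


print(first_bad_commit(history, crawl_fails))
```

The search only gives a trustworthy answer if each test result is reliable; a single wrong good/bad verdict sends it to the wrong commit.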

solbu commented 6 years ago

And it only showed up now?

solbu commented 6 years ago

Besides, it is using v0.2.0. Maybe you should try the master branch on the server?

andy5995 commented 6 years ago

The problem originated in https://github.com/solbu/hldig/commit/49740ad

Unfortunately, I'm not sure what I skipped while testing that patch. But I must have forgotten something because it worked for me at the time.

And also when I tested 888007d the test failed, which was incorrect.

> The problem started at some point between these 2 commits 754bf32 888007d

Anyway, hopefully #123 will fix this.

> And it only showed up now?

The demo site can work indefinitely, as long as the database was created with a binary that doesn't segfault. I only tried to update it a few days ago because you made changes to the web site and so I wanted to re-index it.

Ergo, searching on the demo site kept working right up until I opened this issue. Once #123 is merged, I'll update the demo site.