solbu / hldig

hl://Dig is a fork of ht://Dig, a web indexing and searching system for a small domain or intranet
https://solbu.github.io/hldig/

[discussion] future of libhtdigphp #110

Open andy5995 opened 6 years ago

andy5995 commented 6 years ago

I'm thinking it may be worth it to move libhtdigphp to a separate repo.

I was taking a look at it and one problem is that it's looking for hldig in the wrong directories, since we moved many of the files to /opt/www.

Another problem shows up in the Makefile:

https://github.com/solbu/hldig/blob/3d3f7eb9fc805b59ce14142e6471bcb5c1b48466/libhtdigphp/Makefile#L1-L15

I don't know what was stored on the NFS servers! I just downloaded the CVS code from SourceForge (cvs -z3 -d:pserver:anonymous@a.cvs.sourceforge.net:/cvsroot/htdig co -P htdig) and no big differences really stuck out to me.

When I ran make on htdig-3.2.0b6:

andy@oceanus:~/temp/htdig-3.2.0b6/libhtdigphp$ make
Makefile:13: /nfs/users/rnw/nealr/code/htdig/htdig-CVS-linux/libhtdigphp/build/dynlib.mk: No such file or directory
make: *** No rule to make target '/nfs/users/rnw/nealr/code/htdig/htdig-CVS-linux/libhtdigphp/build/dynlib.mk'.  Stop.
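
The error shows make trying to read a file under /nfs/users/..., so something in the Makefile (either a literal include or a variable that the include expands) carries that prefix. A minimal sketch for locating such lines, assuming a POSIX shell with grep; the function name find_hardcoded_paths is made up for illustration and is not part of hldig:

```shell
# Hypothetical helper (not part of hldig): print every line of a file
# that mentions a given path prefix, together with its line number.
find_hardcoded_paths() {
    # $1: file to inspect (for example a Makefile)
    # $2: path prefix to look for (for example /nfs/)
    # -n prints line numbers, -F treats the prefix as a fixed string.
    grep -nF -- "$2" "$1"
}
```

Running it as find_hardcoded_paths libhtdigphp/Makefile /nfs/ should list the lines that would have to be changed to relative paths or variables before the directory can build on another machine.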

https://github.com/solbu/hldig/blob/master/archived_docs/htdig40_refactor.pdf says a bit about what it's supposed to do:

4.3 Removed Features

Originally, several different auxiliary databases were available for "fuzzy" searching though the htfuzzy tool. Algorithms like synonyms, metaphone and soundex have been removed completely, the endings database has been subsumed into the main CLucene index as a searchable stemmed field, and the accents database is no longer necessary, being required only in the context of an ASCII-only index. Also removed are some of the database management tools like htmerge, htdump, and htload. The htmerge tool was used to merge BDB indexes created during different runs of htdig into one.

Gone also is the CGI based searching executable that was part of ht://Dig. This has been replaced with a simple PHP API that provides the same functionality. This conforms better to the idea of using HtDig as a utility library.

Well, our distribution still has htfuzzy. That doc is apparently referring to the 4.0 version, code from which has been lost, according to @roklein in https://github.com/roklein/htdig/issues/1#issuecomment-342462767

Thank you for your interest in this project. I think it's better you fork and continue there. My repository is basically up to version 3.2b6 and some patches from further down the tree and those used by some major distributions (suse, fedora,...). So in a way, this is probably the most stable version of htdig, currently. Unfortunately it seems, I only put the one tree into the repository (I converted it from CVS), so the older 3.1.6 stuff isn't here. Also the early 4.0 code is missing, but I think you are way better off, not using it. Please note, there is a fork from jrsupplee which has three additional patches. I didn't review those, however, and never got a pull request.

Best regards,
Robert

The only code I know of from 4.x is on our 4_1_0 branch. I tried building it a few times but could only build part of it. Based on what Robert said, I assumed parts of the code were missing.

I suggest that the whole libhtdigphp directory could be moved into a separate repo. We'd link to the repo, and perhaps add it as a submodule.

roklein commented 6 years ago

No, my comment meant, in my repository the ht://Dig 4 stuff is missing. If I recall correctly one of the first things done in the ht://Dig 4.0 branch was replacing searching and indexing with lucene. IMO that means heart and brain of ht://Dig get ripped out and replaced by something else. What remains is basically the web interface and glue code. I may be overboard with my opinion, but back then I was more interested in the search and indexing part of htdig.

Note the code is really outdated, as also explained in the 4.0 refactoring document you linked above. What the document doesn't state explicitly is that htdig was written before C++ became standardized, and in particular the STL is not used in ht://Dig.

andy5995 commented 6 years ago

Thanks again for getting me up to speed, @roklein. Seems like we should do what you suggested originally, and keep going as we are, forgetting about 4.x.

The libhtdigphp stuff may as well stay here, in case someone ever wants to mess with it, or learn from it, but it shouldn't be included by make dist at the present time, imo.

solbu commented 6 years ago

Regarding submodules, the only requirement I have is that the build process cannot depend on external sources. A distribution should only have to download the tarball, untar it, and run configure, make and make install. Nowhere in the build process should an internet connection be required, as many distributions then can't build it. The archive the user downloads must be self-contained.

One can, however, have the other repo's code as a dependency, for example the way libstdc++ and libssl-dev are requirements for compiling.
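
One way to check the self-contained requirement against an unpacked source tree is to look for submodule directories that came out empty, since anything missing would have to be fetched from the network at build time. A rough sketch, assuming a POSIX shell, a tree that still carries its .gitmodules file (as a git archive export does), and submodule paths without spaces; the function name check_self_contained is made up for illustration:

```shell
# Hypothetical helper (not part of hldig): report submodule paths listed in
# .gitmodules that are missing or empty in an unpacked source tree.
# Returns 0 if every listed path is populated, or if there is no .gitmodules.
check_self_contained() {
    tree=$1
    [ -f "$tree/.gitmodules" ] || return 0
    status=0
    # Pull the value of each "path = ..." entry out of .gitmodules.
    for path in $(sed -n 's/^[[:space:]]*path[[:space:]]*=[[:space:]]*//p' "$tree/.gitmodules"); do
        # ls -A prints nothing for an empty or missing directory.
        if [ -z "$(ls -A "$tree/$path" 2>/dev/null)" ]; then
            echo "empty or missing submodule directory: $path"
            status=1
        fi
    done
    return $status
}
```

Calling check_self_contained on the directory the tarball unpacked into would flag any submodule whose contents were not shipped. It does not prove the build works offline; it only catches the most obvious gap.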

andy5995 commented 6 years ago

Yeah, that won't be an issue. hldig is not dependent on the stuff in libhtdigphp at all. And I think it's fine to leave things where they are for now, but we should add something to the README along the lines of: "The package in this directory doesn't currently build and isn't being maintained while we focus on hldig. Anyone interested in working on it can submit a patch, however, or contact us to discuss it."