shadisabzali / dataparksearch

Automatically exported from code.google.com/p/dataparksearch
GNU General Public License v2.0

MS SQL Server Compatibility/Issues #28

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
In our quest to use a database with full substring search capability, we
have compared the performance of PostgreSQL, MySQL and MS SQL Server 2005.
Without a doubt, SQL Server is the fastest for full index scans on tables
with 100+ million rows.

So we are trying to get dpsearch to work with SQL Server 2005. We are using
multi-mode via unixODBC and FreeTDS (latest versions for both). 

1. The scripts to create the db tables are out of date. Several tables
(robots and cookies) are missing, and some fields are missing as well, such
as charset_id from url. There are a few other issues. These issues were
easily resolved, but can they be fixed in the distribution? If you would
like, I could provide the updated files.

2. The initial setup with the -Ecreate command works fine. The srvinfo
table appears to be populated correctly. During indexing, no documents are
indexed, yet no errors are reported. Also, a simple command like
"./indexer -S" returns an error.

When running ./indexer -S with the debug_sql #define turned on in
sqldbms.c, the error message is:
{sqldbms.c2621} Query: COMMIT
    SQL-server message: [unixODBC][FreeTDS][SQL Server]The COMMIT TRANSACTION request has no corresponding BEGIN TRANSACTION.
Then the same line repeats.

The resulting output/statistics from the -S command is empty, just the
headers and predefined content.

I have used SQL Server Profiler to capture the commands sent to the
server by dpsearch. They are as follows:
1) SET IMPLICIT_TRANSACTIONS ON
2) IF @@TRANCOUNT > 0 COMMIT
3) Select status, sum(case when next_index_time <= 1266421936 then 1 else 0
end), count(*), sum(docsize), sum(case when next_index_time <= 1266421936
then docsize else 0 end) from url Group By status order by status
4) If @@TRANCOUNT > 0 COMMIT
5) COMMIT
6) IF @@TRANCOUNT > 0 COMMIT

I've run these commands (as a batch) directly on SQL Server and they return
the same error message. If I remove the standalone COMMIT in command 5), it
works.
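
A minimal sketch of why the standalone COMMIT fails, assuming only the documented behaviour of IMPLICIT_TRANSACTIONS and @@TRANCOUNT. The class `ToySession` and the function `run_batch` are hypothetical names made up for illustration; this is a toy model of the server's transaction counter, not dpsearch, unixODBC or FreeTDS code, and it says nothing about which layer actually emits each command.

```python
class ToySession:
    """Toy model of SQL Server's open-transaction counter (not a real driver)."""

    def __init__(self):
        self.implicit = False   # state of SET IMPLICIT_TRANSACTIONS
        self.trancount = 0      # stands in for @@TRANCOUNT

    def execute(self, statement):
        text = " ".join(statement.upper().rstrip(";").split())
        if text == "SET IMPLICIT_TRANSACTIONS ON":
            self.implicit = True
            return "ok"
        if text == "IF @@TRANCOUNT > 0 COMMIT":
            # Guarded commit: only commits when a transaction is open.
            if self.trancount > 0:
                self.trancount -= 1
            return "ok"
        if text == "COMMIT":
            # Unguarded commit: an error when nothing is open.
            if self.trancount == 0:
                return "error 3902: no corresponding BEGIN TRANSACTION"
            self.trancount -= 1
            return "ok"
        if text.startswith("SELECT"):
            # In implicit mode, a SELECT on a table opens a transaction.
            if self.implicit and self.trancount == 0:
                self.trancount = 1
            return "ok"
        raise ValueError("statement not covered by this toy model")


def run_batch(statements):
    """Run the statements on a fresh toy session and collect the outcomes."""
    session = ToySession()
    return [session.execute(s) for s in statements]


# The six commands captured with the profiler, in order.
CAPTURED = [
    "SET IMPLICIT_TRANSACTIONS ON",
    "IF @@TRANCOUNT > 0 COMMIT",
    "Select status, sum(case when next_index_time <= 1266421936 then 1 else 0 "
    "end), count(*), sum(docsize), sum(case when next_index_time <= 1266421936 "
    "then docsize else 0 end) from url Group By status order by status",
    "If @@TRANCOUNT > 0 COMMIT",
    "COMMIT",
    "IF @@TRANCOUNT > 0 COMMIT",
]

if __name__ == "__main__":
    for number, outcome in enumerate(run_batch(CAPTURED), start=1):
        print(number, outcome)
```

In this model the SELECT in command 3) opens a transaction, the guarded commit in command 4) closes it, and command 5) then finds nothing left to commit, which matches the observation that the batch succeeds once command 5) is removed.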

I have also successfully run the same set of commands, without the extra
commit, via tsql (which comes with FreeTDS), and the data is returned.

If I comment out the COMMIT being sent by dpsearch near line 2621 of
sqldbms.c then I don't receive the SQL Server error message, but the status
results are still zero and indexing is still not performed.

It appears that select statements are not functioning properly, but inserts
are working.

Are there special options to compile unixODBC and FreeTDS for use with
dpsearch relating to auto-commit of transactions? Any other thoughts?

Our version is dpsearch-4.53-19012010 compiled with multi-mode and unixODBC
support.

Thanks in advance.

Original issue reported on code.google.com by Imlbr...@gmail.com on 17 Feb 2010 at 4:30

GoogleCodeExporter commented 9 years ago
1. It would be nice to get the updated files for creating the database
tables. Please attach them in a reply or send them to me; my e-mail:
maxime [at] maxime dot net dot ru

2. dpsearch doesn't send commands 1), 2), 4) and 6), and it definitely
should send BEGIN TRANSACTION before sending COMMIT. Which DBAddr command
do you use? Could you please show the full output of the "indexer -S"
command with DEBUG_SQL uncommented in the sqldbms.c file?

Original comment by dp.max...@gmail.com on 18 Feb 2010 at 6:36