mdbtools / mdbtools

MDB Tools - Read Access databases on *nix
GNU General Public License v2.0
980 stars 208 forks source link

mdb-sql: <= and >= logically backwards #283

Closed Bob620 closed 3 years ago

Bob620 commented 3 years ago

I use the official Windows drivers when operating on Windows. I was sending the (almost) same command to both the drivers and mdb-sql but was receiving different output.

The command in question is:

SELECT * FROM Line WHERE 7.4204006 <= StageX AND 8.1351595 >= StageX AND -3.7076371 <= StageY AND -3.1693828 >= StageY

or for the official driver:

SELECT * FROM [Line] WHERE 7.4204006 <= StageX AND 8.1351595 >= StageX AND -3.7076371 <= StageY AND -3.1693828 >= StageY

This command in my test dataset returns 10 items using the driver, but 0 using mdb-sql.

I at first assumed I was insane and doing something wrong, but after double and triple checking I found that I was in fact querying the data correctly and that the output was logically correct.

The first thing I tried was to swap the >= and <= which fixed all the problems and returned the expected 10 elements.

I will make a pull request to fix this shortly.

Bob620 commented 3 years ago

After reviewing the code further and taking suggestions from @evanmiller, I have come up with some points:

The issue is that the code assumes a handedness to the operation which causes translation errors:

x > 6 --> node.value.d > 6
6 < x --> node.value.d < 6

The best way I can come up with to fix this would be to parse what side the value is located on and give the node a 'handedness' that allows it to correctly position values.

This all goes beyond me without spending a lot of time to understand how the code works as I am inexperienced with C and its not a repository I have worked on before.

Alternately, a 'handedness'-based operator could be introduced into the lexer potentially. this would require a constant on one side or the other and change the logic operation to correspond to the correct translated operation.

pedromorgan commented 3 years ago

Ok so that sql is weird to me.. as a bystander..

Is there a particular reason the "query" is that way.. or am i missing something ??

Bob620 commented 3 years ago

@pedromorgan I've never written SQL outside of several very specific applications, I don't generally work with databases that use it.

The goal is that I have a specific 2d plane, with a number of specific positions. Each position has a StageX and StageY position and I need to select all the positions in the table that fall into the 2d sub-space.

Rather than selecting all the positions from the database as there could be thousands or more, I wanted to minimize the number of output to as small as possible. This command simply was my brain coming up with a method to do that.

@evanmiller told me in https://github.com/mdbtools/mdbtools/pull/284#issuecomment-806650745

It's possible that the engine is fooled because your constants are to the left of your columns names (it's more typical to write X > 10 rather than 10 < X)

So I may just be doing things in non-standard sql methods that seem normal to me from my previous experience outside of SQL.

evanmiller commented 3 years ago

@Bob620 Excellent write-up of the problem. This should enable whoever tackles the issue to devise a solution.

As an aside, based on my general experience writing parsers, the lexer will not be the correct place for a fix - lexers should be "dumb" and just decide what each token is. To start I would look here, in the parser:

https://github.com/mdbtools/mdbtools/blob/ff3c6ebd01622e80615c9d1591892f9b7d80ce71/src/sql/parser.y#L147-L157

evanmiller commented 3 years ago

@Bob620 Your SQL is fine (if perhaps a touch unconventional). All that matters is that there's a bug here that needs fixing!

pedromorgan commented 3 years ago

@Bob620 thanks for explain.. as a muppet self.taught coder I kinda maybe get how it works.. and interesting indeed.

A silly question.. would that query work on sqlite ??

pedromorgan commented 3 years ago

ohh hat off (socks on) to @evanmiller .. what your doing is cool and fantastic.. ;-))

Bob620 commented 3 years ago

@evanmiller thanks for the confirmation that Im not going insane with my SQL logic, would be interested as to why its unconventional, but thats a different conversation. I have no training, seniors, or general education in SQL other than what I find online at this time.

I was looking at the parser but I was very confused as to how it worked. you posting that fragment helped clear some stuff up in my brain and Ill take another look at it. If I get something working I'll make a pull request.

Good to know that the lexer should be as dumb as possible and the parser should be more logic-oriented. I'm trying not to completely ruin the repository!

@pedromorgan I'm not 100% sure about sqllite, but it works as expected in the official Windows drivers. My current code uses Node-Adodb to connect a NodeJS process to .mdb files as I need to access data another program creates.

My specific code that runs the commands is: https://github.com/Bob620/probelab-reimager/blob/fe356e528ebfd35b26a82672255756182d39cbe2/io.js#L86-L142

I have a local version that uses the mdb-sql standalone executable and required shared libraries to do the process on linux and mac since I want to support MDB on Mac.

Bob620 commented 3 years ago

@evanmiller the code formatting is cursed and inconsistent, and it really makes me want to make a pull request just to format it all the same way, but it would also not be the expected/current way because I would just use Jetbrains C/C++ default formatting.

evanmiller commented 3 years ago

@Bob620 The formatting throughout the entire project is a mess! But I leave it more or less intact so that it's easy to track the history on a 20-year-old inherited code base :-)