knex / knex

A query builder for PostgreSQL, MySQL, CockroachDB, SQL Server, SQLite3 and Oracle, designed to be flexible, portable, and fun to use.
https://knexjs.org/
MIT License
19.2k stars 2.12k forks source link

Support for full text search indexes #203

Open elliotf opened 10 years ago

elliotf commented 10 years ago

Postgres, sqlite, and MySQL all support some form of full-text index. It would be useful to support generating schema for and querying based on these indexes.

This does not seem like a trivial feature, as the implementations are very different for index creation requirements (sqlite appears to be at the table-level, MySQL only supports it on MyISAM tables) and querying.

That said, this issue could at least be used to track interest for such a feature.

References:

Example index creation and querying:

CREATE INDEX <INDEX NAME> ON <TABLE NAME> USING gin(to_tsvector(<COLUMN NAME>));

SELECT * FROM <TABLE NAME> WHERE <COLUMN NAME> @@ to_tsquery(<INPUT>);
CREATE FULLTEXT INDEX <INDEX NAME> ON <TABLE NAME>;

SELECT * FROM <TABLE NAME> WHERE MATCH <COLLUMN NAME> AGAINST <INPUT>;
CREATE VIRTUAL TABLE <TABLE> USING fts3(<COLUMN NAME> TEXT);

SELECT * FROM <TABLE NAME> WHERE <COLUMN NAME> MATCH <INPUT>;
nfour commented 10 years ago

Seconded of course, I thought it might have already been supported as it seems kinda important.

tgriesser commented 10 years ago

Will be supported in the next minor release, I'm shooting for the end of March for that.

elliotf commented 10 years ago

@tgriesser wow, that's unexpected but awesome. Thank you!

ErisDS commented 10 years ago

@tgriesser Did this actually happen?

tgriesser commented 10 years ago

It's not released yet. Let me go though and make sure this got added though, trying to close out almost every remaining ticket here with the new version. I know you can now rename column and change types :)

ErisDS commented 10 years ago

Oooooooh exciting! :tada: cc/ @sebgie

tgriesser commented 10 years ago

@ErisDS so I'm looking at this again, just wanted to make sure how I was thinking about going about it would work... was thinking you'd do t.text('colName').fulltext() or t.fulltext(columnName)... just wanted to make sure you were aware though that in sqlite3 you lose the ability to do any other indexing on the table and the ability to add columns. Presumably, that functionality will be wanted, so I'll have to add it to the sqlite3 ddl stuff, but just wanted to point it out.

ErisDS commented 10 years ago

just wanted to make sure you were aware though that in sqlite3 you lose the ability to do any other indexing on the table and the ability to add columns. Presumably, that functionality will be wanted, so I'll have to add it to the sqlite3 ddl stuff, but just wanted to point it out.

You mean you lose ability to add columns normally, and have to do the whole table-copy thing?

tgriesser commented 10 years ago

Yep. Seems to be the story with pretty much everything around modifying sqlite, except now it's even on adding columns.

ErisDS commented 10 years ago

Wooohoooooo :balloon: ... lol

In Ghost - there will be the post markdown, the title, and perhaps the tags names that we'll want to do FTS on I think to start with... not sure if that helps.

ericclemmons commented 10 years ago

I just discovered FULLTEXT ... MATCH in MySQL (such a nub), spent an hour trying to force a fullText: function(name) { this.isFullText = name || true; return this; } into SchemaBuilder then tried to shoehorn it into SchemaGrammar without scucess, then I found this thread ;)

Need some funding for 0.6 @tgriesser? I dunno how you've done this much already :)

MetaMemoryT commented 10 years ago

:+1:

dwstevens commented 9 years ago

@tgriesser @ErisDS For sqllite and FTS maybe following this pattern would work? http://peewee.readthedocs.org/en/latest/peewee/playhouse.html?highlight=full%20text#FTSModel They create a separate table specifically to use in the full text search index. That way they avoid changing all columns to text in the main table.

ErisDS commented 9 years ago

@dwstevens that could work, not sure where we are on FTS here. Bookshelf supports plugins, perhaps something like this could be done with a plugin and knex raw? Would be an awesome thing for someone to have a play with and see if they could get a working version together.

ErisDS commented 9 years ago

I've just raised a discussion issue on Ghost about implementing search, it's linked here. It's quite literally the oldest issue on our repository and we desperately need to move forward with it.

@tgriesser what's the status here in knex? Are there any plans? We could really use your input.

bretmattingly commented 9 years ago

@ErisDS Our solution with Sequelize was to use raw Sequelize queries for our FT indexes. Now that I'm trying to switch to a DAO pattern and using Knex, I'll likely implement something similar. It's not as elegant as having built-in support, but it'll do.

@tgriesser I'd like to help if I can!

ErisDS commented 9 years ago

@bomattin I think most people are using raw queries, but are only having to support a single DB, where we need to have support for Sqlite3, MySQL & pg. Are you planning to write something generalised?

bretmattingly commented 9 years ago

@ErisDS I'm going to start looking into it. Unfortunately most of my programming time for the next week or so is going to be building my DAOs and trying to sell my fellow devs on it. I'm taking a look at the Knex codebase when I can so I can try to dive in and do it right quickly.

s-stude commented 9 years ago

Hello! Any luck on this?

bretmattingly commented 9 years ago

@s-stude No movement on my end, unfortunately. This is what I was using for the time being:

Users.find = function(params, options) {
    return new Promise(function(resolve, reject){
        var options = options? options : {};
        var whereclause;
        var lim = options.limit? options.limit : 65536; // Need a valid number here? 50 isn't a bad default.
        knex.select(cols.user)
            .from('users')
            .orWhereRaw('MATCH(firstname_preferred,lastname_preferred,username) AGAINST(? IN BOOLEAN MODE)', params.search)
            //.orWhereRaw('MATCH(groups.name) AGAINST(? IN BOOLEAN MODE)', '+matt*')
            .limit(lim)
            .then(function(userresults){
                resolve(new Users(userresults))
            })
            .catch(function(err){
                reject(err)
            });
    })
};
hdzidic commented 8 years ago

Any updates here?

This is what I'm using as a workaround:

query.whereRaw('to_tsvector(parts.description) || to_tsvector(cars.engine_name)
        || to_tsvector(makes.name) || to_tsvector(models.name)
        || to_tsvector(part_types.name) || to_tsvector(part_types.description)
        || to_tsvector(part_categories.name)
        || to_tsvector(part_manufacturers.name) @@ to_tsquery(?)',req.query.search);
cbrunnkvist commented 8 years ago

I looked at http://knexjs.org/#Schema-index and missed the little detail about "index type is only supported on PostgreSQL", so I ran the following as part of my migration:

return knex.schema.table('host', t => {
  t.index('name', 'name_idx', 'FULLTEXT') # <<< I wish
})

resulting .debug() output against mysql:

[ { sql: 'alter table `host` add index name_idx(`name`)',
    bindings: [] } ]

where I suppose it should have said add FULLTEXT index... Is there anything in particular that is blocking the Type argument from being used for the mysql dialect?

yamikuronue commented 8 years ago

What happened with this? Is there a holdup? Can I be of help?

elhigu commented 8 years ago

@yamikuronue sure, looks like major databases support this, so common API to create index would be nice. Pull requests are welcome if you like to implement this.

yamikuronue commented 8 years ago

Was any work already done I can build on, or should I start over from scratch? I'm not sure I'll be able to do it, of course, but I'm willing to take a look at least

elhigu commented 7 years ago

removed flood +1 and status update request comments, please use thumbs for voting

EDIT: if there are no new messages in this feed, then there is nothing new to tell

Palisand commented 6 years ago

So how can this be done in the meantime in one top-level invocation (i.e. createTable) without having to use raw for the entire CREATE TABLE statement? Or must we createTable, then:

knex.raw('alter table `table` add fulltext (`col`)');

?

This actually isn't all that annoying. I keep all my schemas in one directory, with file names corresponding to table names. So for some table foo that does not require any unsupported statements:

// foo.js

module.exports = function (table) {
  table.string(...);
};

and for some table bar that requires a FULLTEXT index:

// bar.js

module.exports = {
  builder: function (table) {
    table.string(...);
  },
  raw: function () {
    return knex.raw('alter table `bar` add fulltext (`col`)');
  }
};

Then I just:

function createTable(table) {
  const schema = require('schemas/' + table);
  return (
    typeof schema === 'function'
      ? db.schema.createTable(table, schema)
      : db.schema.createTable(table, schema.builder)
        .then(() => schema.raw())
  ).catch(err => {
    logError(err.message);
  })
}
rapodaca commented 6 years ago

This issue was the top hit for many of the searches I did, but didn't address my question. For those who also end up here for the same reason...

Postgres users can add a full text index to a tsvector column using the third argument to table.index. For example:

exports.up = (knex) => {
  return knex.schema.createTable('foo', (table) => {
    table.increments('id');
    table.specificType('fulltext', 'tsvector');

    table.index('fulltext', null, 'gin');
  );
);

Also see:

https://knexjs.org/#Schema-index

and

https://stackoverflow.com/questions/45871474/how-to-add-gin-index-using-knex-js

kibertoad commented 5 years ago

@rapodaca Thank you for your contribution, added this information to the wiki!

montera82 commented 5 years ago

this may also be a helpful workaround http://blog.victorquinn.com/full-text-search-with-bookshelfjs

davidlandais commented 4 years ago

Does this still working for you guys ? I have tried to use table.index('title', null, 'FULLTEXT') and the SQL result is :

alter table `products` add index `products_title_index`(`title`)

Even using table.index('title', null, 'BOOM'), knex isn't throwing an error and the sql result is the same.

Did i missed something ?

Using RAW query is working.

ALTER TABLE products ADD FULLTEXT(title)
denysaw commented 4 years ago

Guys ) I've thought off workaround without any .then()'s: table.index(null, 'product_fulltext_index', "GIN (to_tsvector('english', name || ' ' || description)); SELECT NOW");

So knex guys should just remove () when fieldName is null and we could get rid of SELECT NOW appendix ))

elhigu commented 4 years ago

Deleted comment that was completely unrelated (about general use of sql indices).

jawadcode commented 3 years ago

It's been quite a while, any updates?

elhigu commented 3 years ago

Nobody hasn't even properly designed how this should work exactly. Doing full text search properly on databases has never been as easy as it sounds. It wont be as advanced as proper FTS engines are, but could be useful for some use cases....

kibertoad commented 3 years ago

PRs definitely would be most welcome.

AllanJard commented 2 years ago

A little addition to this for anyone who finds this thread as I did while looking for how to add custom indexes for full text search (trigram search really) in postgres. I used @denysaw's idea above and came up with:

knexPg.schema.alterTable('users', function (table) {
  table.index(null, 'idx_name', 'gin( col gin_trgm_ops); SELECT NOW');
});

Which gives:

create index "idx_name_last_name" on "users" using gin( col gin_trgm_ops); SELECT NOW ()

This might count as SQL injection ;)