Open natemcmaster opened 8 years ago
combined into https://github.com/aspnet/EntityFramework/issues/1590
Un-combining.
You are talking about fts 3 and 4, will you also support 5?
Now that SQLitePCL.raw supports FTS5, we'd probably just start with support for that.
Some notes:
match()
function to search
SELECT * FROM email WHERE match($query, body)
SELECT * FROM email WHERE match($query, email)
bm25()
(rank), highlight()
and snippet()
functions also take the table identifier as an argumentDaydreaming about this some more:
LINQ like the this...
from e in db.Emails
where EF.Functions.Match(e, query)
orderby EF.Functions.Bm25(e)
select new
{
e.Id,
Subject = EF.Functions.Highlight(e.Subject, "<b>", "</b>"),
Body = EF.Functions.Snippet(e.Body, "<b>", "</b>", "...", 64)
};
...would produce SQL like this.
select
id,
highlight(email, subject, '<b>', '</b>') as subject,
snippet(email, body, '<b>', '</b>', '...', 64) as body
from email
where match(email, $query)
order by bm25(email);
Notice how expressions like Highlight(e.Subject, ...)
are translated into highlight(email, subject, ...)
and you could write either Match(e, ...)
(translates to match(email, ...)
) or Match(e.Body, ...)
(translates to match(body, ...)
).
Obviously, Match
, Bm25
, Highlight
, and Snippet
cannot be evaluated on the client and should throw.
Note, some functionality is already possible in EF Core today.
==
operator instead of using match()
rank
column instead of using bm25()
FromSql
and write the queries yourselffrom e in db.Emails
where e.Body == query
orderby e.Rank
select e;
@bricelam fts5 has a boatload of alternate/equivalent syntaxes you can use; one of them is to use the name of the table as a scalar function accepting a string. Shouldn’t it be possible to just map it as a scalar db function with that support recently in EF Core oob?
Eg
Select * from Table t
Join Search
On t.rowid = search.rowid
Where search(@keywords)
Order by rank ASC
In ADO.NET parlance. If you add the virtual Rank column (already doable) and the scalar function is as I suspect supported, then that makes things easier.
Support for NEAR sucks because it uses the ugly quoted string within a quoted string format, but perhaps there’s an alternative syntax there?
FYI, I finally wrote a blog post about SQLite Full-Text Search and EF Core.
order by bm25(email)
would order by bm25 also work for special letters like umlauts: äöü or something like ß?
After realizing that all these great features and "daydreams" have not become part of the latest efcore version, is there any announcement when (or if) one can expect this will part of a future version? Is it possible to up vote the features for upcoming versions?
@Kalle4242 Yes: 👍 on the issue. We will be publishing a plan for EF7 soon. See release planning for more info.
I have a problem calling a select/search statement with FromSqlRaw on a fts5 virtual table. I have two test projects, both using the same model, but I get different results calling the same statement:
protected override void Up(MigrationBuilder migrationBuilder)
{
migrationBuilder.Sql(@"CREATE VIRTUAL TABLE AssetSearch USING fts5(AssetCommonId UNINDEXED, AssetVersionId
UNINDEXED, AssetPartId UNINDEXED, Title, Body, Comment, Notice)");
}
public IQueryable<AssetSearchDbo> Search(string question)
{
string query =
"SELECT " +
" \"AssetSearch\".RowId " +
" , \"AssetSearch\".AssetCommonId " +
" , \"AssetSearch\".AssetVersionId " +
" , \"AssetSearch\".AssetPartId " +
" , snippet( \"AssetSearch\", 3, '>>', '<<', '...', 8 ) As Title " +
" , snippet( \"AssetSearch\", 4, '>>', '<<', '...', 8 ) As Body " +
" , snippet( \"AssetSearch\", 5, '>>', '<<', '...', 8 ) As Comment " +
" , snippet( \"AssetSearch\", 6, '>>', '<<', '...', 8 ) AS Notice " +
" , bm25(\"AssetSearch\") As Rank " +
" FROM \"AssetSearch\" " +
$" WHERE \"AssetSearch\" MATCH ' - {{{{AssetCommonId AssetVersionId AssetPartId}}}} : {question}' " +
" ORDER BY Rank";
IQueryable<AssetSearchDbo> searchResults = AssetSearch.FromSqlRaw<AssetSearchDbo>(query);
return searchResults;
}
When inspecting the searchResults in one case an enumeration of AssetSearchDbo objects is returned and the results of snippet and rank functions are not computed. The snippet fields contain the entire field values without tagging the "match term" and rank is null. In the context of the second test project an enumeration of Castle.Proxies.AssetSearchDboProxy objects with correct computed snippet and rank functions is returned. Furthermore in the second case the AssetSearch DbSet consist already of those proxy objects.
Can anybody imagine which difference makes it work either this or that way? How can I force the DbContext to use LazyLoading in first case as well (yes it is set in the OptionsBuilder)? Or, the other way round, what is going wrong in the first case, that lazyloading is not used there?
Maybe there is someone knowing more about the combination of efcore, lazy loading and fts5. By the way I'm working with VS2022 and Net6.0. Thanks.
I'm working on it myself. A difference between the two test project is, that in first on the context is kept open during the entire test. In the second one a new DbContext is created for each test method. I changed the first one in a way it creates a new DbContext before the search method is called. And this worked!!! Finding1: At least one prior action on the DbContext leaves it in a state, that it does not work properly in the described case. Maybe s.o. can guess, what this can be. Finding2: For the moment, there's work-around! Or maybe it is even recommendable to work that way. Maybe s.o. has an advice here.
Do you have any sample project on how to use FTS5 on SQLite EF .NET Core? I followed https://www.bricelam.net/2020/08/08/sqlite-fts-and-efcore.html but not sure if I understand correctly, I have few problems: there is FTSPost model which should create "normal" table, however we replace migration of this table with manual migration so the virtual table is created with rowid and Content column: CREATE VIRTUAL TABLE FTSPost USING fts5(Content); Now, the FTSPost table(FTSPost virtual table) doesn't have the properties needed to one to one mapping with Post table(which are defined in FTSPost model): class FTSPost { public int RowId { get; set; } public Post Post { get; set; } public string Content { get; set; } public string Match { get; set; } public double? Rank { get; set; } }
Can you share the migration and the data model created by this?
Sorry. I didn't manage to run Brice's example. It also didn't match my requirements. The search content of my solution is not in a database table. My database is a metadata database for different assets which our solution handles. The search data is extracted from the content of those assets and only should be separately stored to fts5 tables for full-text-search purpose. It is not required, so it is superfluous in the metadata database. As you can see I'm selecting my search hits via a FromSqlRaw call. This is closer to the example of VahidN https://github.com/VahidN/EFCoreSQLiteFTS But my fts5 virtual table representing object contains a reference to a table of the metadata database. I added it by accident, maybe Brice's example in mind. I did not add it to the modelbuilder in the OnModelCreating method and it workedf or me. Because it's a navigation property you should make it virtual public virtual Post Post {get; set; } Attention: If you add an additional automatic database migration, an index and a foreign key will be added to the up method and deleting them to the down method for this reference. You have to delete this statements. Leaving it in the migration, will overwrite the virual fts5 table with a conrete table with the same name. It took me hours to realize that is looking different in the DB Browser for SqLite, wondering why it is not working anymore. Hope this will help you.
Thanks, @Kalle4242. I looked at the Vahid's project and also your's suggestions in mind and I was able to create first working solution. So, I have manual migration to create virtual tables and used FromSQLRaw for querying them, that seems to be working. BTW, I'm using SQLIteStudio, it displays the letter 'V' when the table is vurtual, it seems that part is missing in DB Broeser.
SQLite's FTS3 and FTS4 (full-text search) extension modules allow users to create special tables with a built-in full-text index. This extension is usually enabled by default. (I checked: most platforms we support have a version of SQLite compiled with ENABLE_FTS3).
To leverage this, the create table syntax and query syntax are slightly different.
To enable this in EF, we would need to add some kind of configuration option to specific which tables are FTS and make query respond accordingly.
In the meantime, users can work around this with
MigrationBuilder.Sql
andFromSql
.More docs: https://www.sqlite.org/fts3.html