orientechnologies / orientdb

OrientDB is the most versatile DBMS supporting Graph, Document, Reactive, Full-Text and Geospatial models in one Multi-Model product. OrientDB can run distributed (Multi-Master), supports SQL, ACID Transactions, Full-Text indexing and Reactive Queries.
https://orientdb.dev
Apache License 2.0

After update of a document values are missing in FULLTEXT index #8602

Closed EricSchreiner closed 3 years ago

EricSchreiner commented 5 years ago

OrientDB Version: 3.0.8, 3.0.9

Java Version: <1.8>

OS: <Windows 10>

Expected behavior

After changing a value in a document, values seem to disappear from the index. I've attached a unit test to reproduce this. The behavior seems to be new: before OrientDB 3.x we did not have this problem.

Actual behavior

Search for "mytesttag" in Index
select from index:idxkeywords where key = ? (?=mytesttag)
Number of documents found in index: 1

Now Update Document
Found indexDocument: {key:mytesttag,rid:#25:44} Docid=44
Found document: test#25:44{name:Test44,keywords:keywordone44tag tagone44tag tagtwo44tag mytesttag} v1
Updated document: test#25:44{name:Test44,keywords:keywordone44 tagone44 tagtwo44 mytesttag onemorekeyword} v2

Now search again
select from index:idxkeywords where key = ? (?=mytesttag)
Number of documents found in index: 0
THIS SHOULD BE ONE AND NOT ZERO
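To make the expectation concrete, here is a minimal in-memory sketch of the update semantics the report assumes for a full-text index: when a record is re-indexed, tokens that occur in both the old and the new text must stay findable. `KeywordIndexSketch` and its methods are hypothetical names for illustration only; this is not OrientDB's implementation, and whitespace tokenization is an assumption.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory model of a full-text index; not OrientDB code.
final class KeywordIndexSketch {
    // token -> record IDs whose text contains that token
    private final Map<String, Set<String>> postings = new HashMap<>();
    // record ID -> tokens currently indexed for that record
    private final Map<String, Set<String>> indexedTokens = new HashMap<>();

    // Assumption: tokens are simply the whitespace-separated words.
    private static Set<String> tokenize(String text) {
        Set<String> tokens = new HashSet<>(Arrays.asList(text.trim().split("\\s+")));
        tokens.remove("");
        return tokens;
    }

    // Insert or update: drop the record's old tokens, then add the new ones.
    void put(String rid, String text) {
        Set<String> oldTokens = indexedTokens.getOrDefault(rid, new HashSet<>());
        for (String token : oldTokens) {
            Set<String> rids = postings.get(token);
            if (rids != null) {
                rids.remove(rid);
                if (rids.isEmpty()) {
                    postings.remove(token);
                }
            }
        }
        Set<String> newTokens = tokenize(text);
        for (String token : newTokens) {
            postings.computeIfAbsent(token, k -> new HashSet<>()).add(rid);
        }
        indexedTokens.put(rid, newTokens);
    }

    // Number of records currently indexed under the given token.
    int count(String token) {
        Set<String> rids = postings.get(token);
        return rids == null ? 0 : rids.size();
    }

    public static void main(String[] args) {
        KeywordIndexSketch index = new KeywordIndexSketch();
        index.put("#25:44", "keywordone44tag tagone44tag tagtwo44tag mytesttag");
        System.out.println(index.count("mytesttag"));   // before the update
        index.put("#25:44", "keywordone44 tagone44 tagtwo44 mytesttag onemorekeyword");
        System.out.println(index.count("mytesttag"));   // after the update
        System.out.println(index.count("tagone44tag")); // token removed by the update
    }
}
```

In this model "mytesttag" is found once both before and after the update, while "tagone44tag" is no longer found, which matches the behavior the unit test expects.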

Steps to reproduce

```java
package de.contecon.picapport.db;

import java.io.File;
import java.util.List;

import org.junit.After;
import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;

import com.orientechnologies.orient.core.db.ODatabasePool;
import com.orientechnologies.orient.core.db.ODatabaseSession;
import com.orientechnologies.orient.core.db.ODatabaseType;
import com.orientechnologies.orient.core.db.OrientDB;
import com.orientechnologies.orient.core.db.OrientDBConfig;
import com.orientechnologies.orient.core.id.ORecordId;
import com.orientechnologies.orient.core.metadata.schema.OClass;
import com.orientechnologies.orient.core.metadata.schema.OClass.INDEX_TYPE;
import com.orientechnologies.orient.core.metadata.schema.OType;
import com.orientechnologies.orient.core.record.ORecord;
import com.orientechnologies.orient.core.record.impl.ODocument;
import com.orientechnologies.orient.core.sql.executor.OResultSet;
import com.orientechnologies.orient.core.sql.query.OSQLSynchQuery;

public class TestOrientDbIndexUpdate {
  private int insertcount;
  private String databaseName = this.getClass().getSimpleName();
  private OrientDB orientDB;
  private ODatabasePool dbPool;

  @Before
  public void setUp() throws Exception {
    String databaseName = this.getClass().getSimpleName();
    orientDB = new OrientDB("embedded:./playground/target/databases/", OrientDBConfig.defaultConfig());
    orientDB.create(databaseName, ODatabaseType.PLOCAL);
    ODatabaseSession dbs = orientDB.open(databaseName, "admin", "admin");
    insertcount = 100;
    OClass test = dbs.getMetadata().getSchema().createClass("test");
    dbs.command("ALTER CLASS test CLUSTERSELECTION default"); // This is very important for us
    test.createProperty("name", OType.STRING);
    test.createProperty("keywords", OType.STRING).setMandatory(false);
    test.createIndex("idxkeywords", INDEX_TYPE.FULLTEXT, "keywords");
    dbs.close();
    dbPool = new ODatabasePool(orientDB, databaseName, "admin", "admin");
  }

  @After
  public void tearDown() throws Exception {
    // Drop the database and verify that its folder is gone
    dbPool.close();
    orientDB.drop(databaseName);
    File dbFolder = new File("./playground/target/databases/" + databaseName);
    Assert.assertEquals(false, dbFolder.exists());
  }

  @Test
  public void testOdbIdexUpdate1() {
    ODatabaseSession dbs = dbPool.acquire();
    fillDb(dbs, insertcount);
    dbs.close();
    // Search document
    testExistenceOfTag("mytesttag");
    updateKeyword("mytesttag", "keywordone44 tagone44 tagtwo44 mytesttag onemorekeyword");
    testExistenceOfTag("mytesttag"); // <---- Test fails here !!!!!!
  }

  private void testExistenceOfTag(String tag) {
    ODatabaseSession dbs = dbPool.acquire();
    System.out.println("select from index:idxkeywords where key = ? (?=" + tag + ")");
    List<ODocument> result =
        dbs.command(new OSQLSynchQuery<ODocument>("select from index:idxkeywords where key = ?")).execute(tag);
    System.out.println("Number of documents found in index: " + result.size());
    Assert.assertEquals(1, result.size());
    dbs.close();
  }

  private void updateKeyword(String tag, String newKeyword) {
    ODatabaseSession dbs = dbPool.acquire();
    List<ODocument> result =
        dbs.command(new OSQLSynchQuery<ODocument>("select from index:idxkeywords where key = ?")).execute(tag);
    Assert.assertEquals(1, result.size());
    // Search document in index
    ODocument oDocumentIndex = result.get(0);
    int docId = getDocId(oDocumentIndex);
    System.out.println("Found indexDocument: " + oDocumentIndex.getRecord().toString() + " Docid=" + docId);
    // Load original document and update
    int clusterId = dbs.getClusterIdByName("test");
    ORecordId oRecordId = new ORecordId(clusterId, docId);
    ODocument oDocumentTest = dbs.load(oRecordId);
    System.out.println("Found document: " + oDocumentTest);
    oDocumentTest.field("keywords", newKeyword);
    ORecord saved = dbs.save(oDocumentTest);
    System.out.println("Updated document: " + oDocumentTest);
    dbs.close();
  }

  private void fillDb(ODatabaseSession dbs, int count) {
    for (int i = 0; i < count; i++) {
      String keywords = "keywordone" + i + "tag tagone" + i + "tag tagtwo" + i + "tag";
      if (i == 44) {
        keywords += " mytesttag";
      }
      ODocument doc = new ODocument("test");
      doc.field("name", "Test" + i);
      doc.field("keywords", keywords);
      ORecord saved = dbs.save(doc);
      System.out.println("Insert document: " + saved);
    }
    OResultSet result = dbs.command("select * from test");
    Assert.assertEquals(count, result.elementStream().count());
    result.close();
  }

  // Extracts the record position from the index entry's string form, e.g. {key:mytesttag,rid:#25:44}
  private int getDocId(ODocument iDoc) {
    String sid = iDoc.getRecord().toString();
    int id = parsePid(sid, 1);
    return id;
  }

  // Parses the number after the last ':' and ignores the trailing rt characters
  private final int parsePid(String recId, int rt) {
    return string2int(recId.substring(recId.lastIndexOf(':') + 1, recId.length() - rt));
  }

  // Returns 0 if the input is not a valid integer
  private final int string2int(String in) {
    int i = 0;
    try {
      i = Integer.parseInt(in.trim());
    } catch (Exception e) {
      i = 0;
    }
    return i;
  }
}
```
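The test finds the document ID by cutting the text after the last ':' out of the index entry's string form (for example `{key:mytesttag,rid:#25:44}`) and falls back to 0 on a parse error. As a side note, a slightly more defensive way to pull a record ID out of such a string could look like the sketch below. `RidParserSketch` is a hypothetical helper that uses only the Java standard library; it assumes the `#cluster:position` notation seen in the output above and does not rely on any OrientDB API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; parses the first "#cluster:position" pair found in a string.
final class RidParserSketch {
    private static final Pattern RID = Pattern.compile("#(-?\\d+):(-?\\d+)");

    // Returns {clusterId, clusterPosition}; throws if the text contains no record ID.
    static long[] parse(String text) {
        Matcher m = RID.matcher(text);
        if (!m.find()) {
            throw new IllegalArgumentException("No record ID found in: " + text);
        }
        return new long[] {Long.parseLong(m.group(1)), Long.parseLong(m.group(2))};
    }

    public static void main(String[] args) {
        long[] rid = parse("{key:mytesttag,rid:#25:44}");
        System.out.println("cluster=" + rid[0] + " position=" + rid[1]);
    }
}
```

Throwing on malformed input instead of returning 0 avoids silently loading record 0 of the cluster when the string has an unexpected shape.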

EricSchreiner commented 5 years ago

Hi @luigidellaquila, it seems to me that @tglman is currently not active. We have used OrientDB since version 1 in our free photo server PicApport (www.picapport.de). The problem described above is causing us a lot of trouble at the moment. It would be great if someone could have a look at it. I think this is a really serious bug.

EricSchreiner commented 5 years ago

[Attached screenshots: odb-document, odb-indexok, odb-indexproblem]

EricSchreiner commented 5 years ago

Hi @luigidellaquila, hi @tglman, the problem still exists with 3.0.9.

wolf4ood commented 5 years ago

Hi @EricSchreiner

I've reproduced this issue and am trying to fix the bug.

Thanks

EricSchreiner commented 5 years ago

Hi @wolf4ood thank you. Let me know if you need help or more information.

EricSchreiner commented 5 years ago

Hi @wolf4ood, hi @luigidellaquila, hi @tglman, any news on this? The issue has not even been tagged as a bug, let alone solved?

lvca commented 5 years ago

@EricSchreiner, have you tried using the LUCENE index for full-text instead of the internal one? It's way more stable and faster than the (old) internal one.
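For readers who want to try this suggestion with the schema from the test above, the sketch below builds the two SQL statements that would be involved. It rests on assumptions: the statement shapes (`CREATE INDEX ... FULLTEXT ENGINE LUCENE` and the `SEARCH_INDEX` function) follow the OrientDB 3.0 Lucene documentation as best I recall it, the `orientdb-lucene` module has to be on the classpath, and `LuceneIndexSqlSketch` is a hypothetical helper rather than part of OrientDB. Please check the syntax against the documentation of the release you run.

```java
// Hypothetical helper that only builds SQL strings; it does not talk to a database.
final class LuceneIndexSqlSketch {
    private static final String IDENTIFIER = "[A-Za-z_][A-Za-z0-9_]*";
    private static final String TERM = "[A-Za-z0-9]+";

    private static String requireMatch(String value, String regex) {
        if (value == null || !value.matches(regex)) {
            throw new IllegalArgumentException("Unsupported value: " + value);
        }
        return value;
    }

    // Statement that creates a Lucene-backed full-text index (assumed 3.0 syntax).
    static String createIndex(String indexName, String className, String property) {
        return String.format("CREATE INDEX %s ON %s (%s) FULLTEXT ENGINE LUCENE",
                requireMatch(indexName, IDENTIFIER),
                requireMatch(className, IDENTIFIER),
                requireMatch(property, IDENTIFIER));
    }

    // Query that searches a single term through the index (assumed 3.0 syntax).
    static String searchByTerm(String className, String indexName, String term) {
        return String.format("SELECT FROM %s WHERE SEARCH_INDEX(\"%s\", \"%s\") = true",
                requireMatch(className, IDENTIFIER),
                requireMatch(indexName, IDENTIFIER),
                requireMatch(term, TERM));
    }

    public static void main(String[] args) {
        System.out.println(createIndex("idxkeywords", "test", "keywords"));
        System.out.println(searchByTerm("test", "idxkeywords", "mytesttag"));
    }
}
```

In the test above, the first statement would take the place of `test.createIndex("idxkeywords", INDEX_TYPE.FULLTEXT, "keywords")` and the second would take the place of the `select from index:idxkeywords where key = ?` lookup. Both could be passed to `dbs.command(...)`, the same call the test already uses for `select * from test`.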

EricSchreiner commented 3 years ago

Has this issue been fixed? If yes, in which release?

andrii0lomakin commented 3 years ago

Hi @EricSchreiner, we have not supported the Full-Text index for a long time because of the many issues with it. Please use the Lucene index for such cases.