datanucleus / datanucleus-cassandra

DataNucleus support for persistence to Cassandra datastores

Support LIKE queries for Cassandra with SASI #33

Closed yhilem closed 5 years ago

yhilem commented 5 years ago

Hi,

Since Cassandra 3.4, LIKE queries can be run against columns that have an SSTable Attached Secondary Index (SASI). JPQL supports the SQL LIKE operator, which provides a limited form of string pattern matching. Is this feature supported with SASI?

Thanks
Y. HILEM

andyjefferson commented 5 years ago

An issue tracker is not the place to ask a QUESTION; that is what a support forum is for. I have never heard of "SASI", so I haven't implemented such a thing. Since this is open source, you could easily contribute support for it.

yhilem commented 5 years ago

SASI indexes are calculated and stored as part of each SSTable file, differing from the original Cassandra implementation, which stores indexes in separate, “hidden” tables.

CREATE CUSTOM INDEX user_last_name_sasi_idx ON user (last_name)
 USING 'org.apache.cassandra.index.sasi.SASIIndex';

SASI indexes do offer functionality beyond the traditional secondary index implementation, such as the ability to do inequality (greater than or less than) searches on indexed columns. You can also use the new CQL LIKE keyword to do text searches against indexed columns. For example, you could use the following query to find users whose last name begins with “N”:

SELECT * FROM user WHERE last_name LIKE 'N%';
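Which LIKE patterns an index can serve depends on where the `%` wildcard sits. The sketch below classifies a pattern by wildcard position. It rests on two assumptions that should be checked against the Cassandra documentation for your version: that a SASI index in its default mode serves only trailing-wildcard patterns such as `N%`, with leading wildcards needing a different index mode, and that CQL LIKE does not handle the JPQL single-character wildcard `_`. The class and method names are hypothetical and belong to neither Cassandra nor DataNucleus.

```java
// Hypothetical helper for illustration; not part of Cassandra or DataNucleus.
final class LikePatternSketch {
    enum Kind { EXACT, PREFIX, SUFFIX, CONTAINS, UNSUPPORTED }

    /** Classifies a LIKE pattern by where its '%' wildcards sit. */
    static Kind classify(String pattern) {
        boolean leading = pattern.startsWith("%");
        boolean trailing = pattern.length() > 1 && pattern.endsWith("%");
        // Strip at most one wildcard from each end and inspect what remains.
        String core = pattern.substring(leading ? 1 : 0,
                                        pattern.length() - (trailing ? 1 : 0));
        // Assumption: an empty core, an inner '%' or any '_' cannot be pushed to CQL.
        if (core.isEmpty() || core.contains("%") || core.contains("_")) {
            return Kind.UNSUPPORTED;
        }
        if (leading && trailing) return Kind.CONTAINS;
        if (leading) return Kind.SUFFIX;
        if (trailing) return Kind.PREFIX;
        return Kind.EXACT;
    }

    public static void main(String[] args) {
        System.out.println(classify("N%")); // prints PREFIX
    }
}
```

A query layer could use such a check to decide whether a LIKE can be sent to the datastore or has to be evaluated in memory.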

Normally, if I create the index, I can query with a LIKE clause using JPA native CQL queries (http://www.datanucleus.org/products/accessplatform_4_2/jpa/native_query.html). From (http://www.datanucleus.org/documentation/news/access_platform_4_0.html), Version 4.0 includes the following over 3.2/3.3:

Thanks Youcef HILEM

andyjefferson commented 5 years ago

As already said, if you want support for such a feature then you need to implement it.

yhilem commented 5 years ago

Okay, but before implementing anything I have to understand what is missing. I will first test the CQL native queries supported by DataNucleus.

yhilem commented 5 years ago

My project (https://github.com/linkedin/ambry/issues/555#issuecomment-466749576) uses S3Proxy (https://github.com/gaul/s3proxy/wiki/Using-S3Proxy-in-Java-projects) with jclouds-jdbc (https://github.com/jclouds/jclouds-labs/tree/master/jdbc) and ambry-linkedin (https://github.com/linkedin). The stack is S3Proxy-JClouds-Cassandra-Ambry.

For jclouds-jdbc I want to use DataNucleus JPA with Cassandra. That is why I am trying to get this query with a LIKE clause to work (https://github.com/jclouds/jclouds-labs/blob/master/jdbc/src/main/java/org/jclouds/jdbc/repository/BlobRepository.java#L45):

public List<BlobEntity> findBlobsByDirectory(ContainerEntity containerEntity, String directory) {
   return entityManager.get().createQuery("SELECT b FROM " + entityClass.getName() + " b "
         + "WHERE b.containerEntity = :containerEntity AND b.key != :directoryName AND b.key LIKE :directoryLike ", entityClass)
         .setParameter("containerEntity", containerEntity)
         .setParameter("directoryName", directory)
         .setParameter("directoryLike", directory + "%")
         .getResultList();
}
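To make the intent of that query concrete, here is a minimal in-memory sketch of the same filter in plain Java: keep every key that starts with the directory name, except the directory entry itself. It assumes the directory name contains no `%` or `_`, which JPQL LIKE would treat as wildcards. The class name, method name and sample keys are hypothetical and are not taken from jclouds-jdbc.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of the query's semantics; not jclouds or DataNucleus code.
final class DirectoryFilterSketch {

    /** Mirrors "key != :directoryName AND key LIKE :directoryLike" with directoryLike = directory + "%". */
    static List<String> keysUnderDirectory(List<String> keys, String directory) {
        List<String> result = new ArrayList<>();
        for (String key : keys) {
            // startsWith stands in for the trailing-wildcard LIKE.
            if (!key.equals(directory) && key.startsWith(directory)) {
                result.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
                "photos/", "photos/a.jpg", "photos/2019/b.jpg", "docs/c.txt");
        // prints [photos/a.jpg, photos/2019/b.jpg]
        System.out.println(keysUnderDirectory(keys, "photos/"));
    }
}
```

Only the trailing-wildcard LIKE is what a SASI index would be asked to serve; how the other two predicates are evaluated is a separate question for the datastore mapping.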

I do not know the internal architecture of DataNucleus for Cassandra. Could you explain what I need to do to implement this feature?

Thanks Youcef HILEM

andyjefferson commented 5 years ago

All code has comments, so delve into it. JPQLQuery controls the basic conversion of JPQL into CQL. QueryToCQLMapper does the conversion. Any query has a "generic" compilation, shown in the log. QueryToCQLMapper converts that into a "datastore" compilation (aka CQL). That's all there is to know. Look at the generic compilation that your query is compiled to (for the LIKE clause) and then find the equivalent place in the QueryToCQLMapper (or superclass stub) that handles that type, and implement it.
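The conversion step described above can be pictured with a toy example: walk an expression tree and emit CQL text, adding a case for LIKE. This is only a sketch of the general idea. Every type and method below is a hypothetical stand-in; none of it reproduces the real QueryToCQLMapper API or the structure of DataNucleus's generic compilation, which you should read in the source.

```java
// All types here are hypothetical stand-ins; they are NOT DataNucleus classes.
final class LikeMappingSketch {
    interface Expr {}

    static final class Field implements Expr {
        final String column;
        Field(String column) { this.column = column; }
    }

    static final class Literal implements Expr {
        final String value;
        Literal(String value) { this.value = value; }
    }

    static final class Like implements Expr {
        final Field field;
        final Literal pattern;
        Like(Field field, Literal pattern) { this.field = field; this.pattern = pattern; }
    }

    static final class And implements Expr {
        final Expr left;
        final Expr right;
        And(Expr left, Expr right) { this.left = left; this.right = right; }
    }

    /** Converts the toy expression tree into a CQL WHERE fragment. */
    static String toCql(Expr e) {
        if (e instanceof And) {
            And a = (And) e;
            return toCql(a.left) + " AND " + toCql(a.right);
        }
        if (e instanceof Like) {
            Like l = (Like) e;
            return l.field.column + " LIKE " + quote(l.pattern.value);
        }
        throw new UnsupportedOperationException("No CQL mapping for " + e.getClass().getSimpleName());
    }

    /** CQL string literals use single quotes; an embedded quote is doubled. */
    static String quote(String s) {
        return "'" + s.replace("'", "''") + "'";
    }

    public static void main(String[] args) {
        Expr where = new Like(new Field("last_name"), new Literal("N%"));
        System.out.println(toCql(where)); // prints last_name LIKE 'N%'
    }
}
```

The sketch inlines the pattern as a literal for readability. A real mapper would more likely bind it as a statement parameter.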

yhilem commented 5 years ago

Thank you. To create the SASI index (https://docs.datastax.com/en/dse/5.1/cql/cql/cql_using/useSASIIndex.html), I would also need to add an index annotation to the following attribute (https://github.com/jclouds/jclouds-labs/blob/master/jdbc/src/main/java/org/jclouds/jdbc/entity/BlobEntity.java#L46):

   @Id
   private String key;

Sorry, but I'm stuck on that too. At worst, I will create the SASI index with an external script.

andyjefferson commented 5 years ago

JPA already has @Index annotations, and they are supported; see CassandraSchemaHandler (https://github.com/datanucleus/datanucleus-cassandra/blob/master/src/main/java/org/datanucleus/store/cassandra/CassandraSchemaHandler.java#L1078). But that is not the "issue" you have raised here, so it has nothing to do with this.

yhilem commented 5 years ago

Thank you. Now I have all the information I need. I will submit a PR as soon as it is ready.

andyjefferson commented 5 years ago

No testcase demonstrating anything and no pull request, so closing. This can be reopened if it complies with the basic rules of this project, or if a PR is contributed for whatever this is.