michaellavelle / spring-data-dynamodb

Simplifies the development of creating an Amazon DynamoDB-based Java data access layer
https://derjust.github.io/spring-data-dynamodb/
Apache License 2.0

Some troubles with sorting... #52

Closed trelorny closed 9 years ago

trelorny commented 9 years ago

Hi!

I am quite new to DynamoDB and Spring Data, so maybe my question is not very bright, but I have already spent almost 2 days on a problem that I cannot solve, although it seems to be very easy.

I just want to list products in descending order of creationdate; the task is as simple as that. I experimented with spring-data-dynamodb (I used the "Reply" as a pattern of what I want to do) and ended up having no errors when requesting DynamoDB, but I also did not get any data :(

Here is some code I ended up with: The Repository "ProductRepository"

public interface ProductRepository extends DynamoDBPagingAndSortingRepository<ProductItem, ProductId> {

    Page<ProductItem> findByCategoryAndCreationdateAfterOrderByCreationdateDesc(ProductId product,Date creationdate,Pageable pageable);

    @EnableScan
    @EnableScanCount
    Page<ProductItem> findByProductIdOrderByCreationdateDesc(ProductId id,Pageable pageable);      
}

Here are my annotated classes:

The Item of the table "products" - ProductItem


import java.util.ArrayList;
import java.util.List;

import org.springframework.data.annotation.Id;

import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBAttribute;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBHashKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBIgnore;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBIndexHashKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBIndexRangeKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBMarshalling;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBMarshaller;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBRangeKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBTable;

@DynamoDBTable(tableName = "products")
public class ProductItem extends Link {

    @Id
    private ProductId productId;

    private String owner;

    private String headline;
    private String description;

    private String subcategory;

    <further attributes>

    @DynamoDBHashKey(attributeName = "productId")
    @DynamoDBMarshalling(marshallerClass=ProductIdMarshaller.class)
    public ProductId getProductId() {
        return productId;
    }

    public String getCategory() {
        return productId != null ? productId.getCategory() : null;
    }

    @DynamoDBMarshalling(marshallerClass=ProductIdMarshaller.class)
    public void setProductId(ProductId id) {
        this.productId = id;
    }

    @DynamoDBRangeKey
    public String getCreationdate() {
        return productId != null ? productId.getCreationdate() : null;
    }

    public void setCreationdate(String creationdate) {
        if (productId == null) {
            productId = new ProductId();
        }
        productId.setCreationdate(creationdate);
    }
}

And the composite ProductId, which I am not sure whether I really need:


import java.io.Serializable;
import java.util.Date;

import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBAutoGeneratedKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBHashKey;
import com.amazonaws.services.dynamodbv2.datamodeling.DynamoDBRangeKey;

public class ProductId implements Serializable {

    @DynamoDBHashKey
    @DynamoDBAutoGeneratedKey
    private String category;

    @DynamoDBRangeKey
    private String creationdate;

    public String getCategory() {
        return category;
    }

    public void setCategory(String category) {
        this.category = category;
    }

    public String getCreationdate() {
        return creationdate;
    }

    public void setCreationdate(String creationdate) {
        this.creationdate = creationdate;
    }
}

Said in SQL, the only thing I need for now is something like SELECT * FROM articles ORDER BY creationdate DESC/ASC. Later I plan to add GSIs to perform other searches like "attribute 'headline' starts with 'Foo'", but that is maybe another problem.

So, basically my question is: how should the table, the repository and the annotated class be set up to get this working with spring-data-dynamodb? Am I maybe trapped in the world of relational databases, which my brain cannot escape?

Thank you very much in advance for your help - I am quite desperate :)

michaellavelle commented 9 years ago

Hi there

Thanks for your interest in spring-data-dynamodb.

For the use-case of returning all items in a table in a sorted order, I would question whether you would want to use DynamoDB as the solution for this, as it is very different from a relational database.

Returning all items in a table (without any limiting conditions) is an expensive operation in DynamoDB as I understand it, and would require a scan operation I believe. Sorting in DynamoDB is only supported under certain criteria, such as sorting by a range key or index range key, where all the items are for the same hash key.

If you're wanting to return all items from a table in order, I don't believe there is a good way to do this with DynamoDB that is efficient for large datasets.

Say your products were allocated a category. It would then be possible to search for all products within a category, ordered by creation date, if you set up a hash and range table with hash key = category and index range key = creation date. You can sort all items that share a hash key, but you can't sort all items in the table without a hash key condition.
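To illustrate the idea, here is a minimal in-memory sketch in plain Java. It does not call DynamoDB or spring-data-dynamodb; the class HashRangeSketch and its methods are made up for illustration, and creation dates are assumed to be ISO-8601 strings so that string order matches date order. It only models why a sorted result needs a hash key value, while a full scan has no defined order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory model only (hypothetical names); not the DynamoDB API.
final class HashRangeSketch {

    // hash key (category) -> items kept sorted by range key (creation date)
    private final Map<String, TreeMap<String, String>> partitions = new HashMap<>();

    void put(String category, String creationdate, String headline) {
        partitions.computeIfAbsent(category, k -> new TreeMap<>()).put(creationdate, headline);
    }

    // "Query": requires a hash key value; results follow the range-key order.
    List<String> queryByCategoryDesc(String category) {
        TreeMap<String, String> partition = partitions.get(category);
        if (partition == null) {
            return Collections.emptyList();
        }
        return new ArrayList<>(partition.descendingMap().values());
    }

    // "Scan": visits every partition; no order across partitions is defined.
    List<String> scanAll() {
        List<String> all = new ArrayList<>();
        for (TreeMap<String, String> partition : partitions.values()) {
            all.addAll(partition.values());
        }
        return all;
    }

    public static void main(String[] args) {
        HashRangeSketch table = new HashRangeSketch();
        table.put("books", "2015-01-10", "Old book");
        table.put("books", "2015-03-02", "New book");
        table.put("toys", "2015-02-01", "Robot");
        // Newest first within the "books" category: [New book, Old book]
        System.out.println(table.queryByCategoryDesc("books"));
    }
}
```

The sorted map per category plays the role of the range key ordering inside one hash key; there is no equivalent structure spanning all categories, which is why the scan cannot promise an order.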

Hope this makes sense,

Cheers,

Michael

trelorny commented 9 years ago

Hi!

Thank you very much for the reply.

Of course I see that any scanning operation is quite expensive in Dynamo (as in almost any other database system). So the goal would actually be to provide paginated access or something like that.

But anyway, I found another solution: I used a Global Secondary Index (GSI) in order to perform queries like "Get all the articles of a specific category ordered by creationdate". I danced around a lot with hash + range key on the table itself, but I could not make it fly (see my first post). A GSI on a table with only a hash key actually suits my purpose quite well, but I needed to make an adaptation to the code of spring-data-dynamodb. I would like to submit a pull request, because with this change GSIs are quite easy to use and to describe, even on tables with only a hash key.

However, I do not know what kind of side effects this change has. So, in the next few days I will provide a pull request, if you like, along with a description of how to use it. If I had had such a description, it would have saved me a tremendous amount of time, so I guess it would be beneficial for future users of this very cool spring-data implementation!

michaellavelle commented 9 years ago

Hi - I'm glad you found a solution to your requirements. I'm confused though as Global Secondary Indexes are already supported by spring-data-dynamodb - what change did you need to make?

trelorny commented 9 years ago

Yep, GSIs were supported, but either I used them wrongly or there was a bug. When I used the repository method

Page<ArticleItem> findByCategoryAndCreationdateBeforeOrderByCreationdateDesc(String category, Date date, Pageable pageRequest);

I got an 'UnsupportedOperationException: Sorting for global index queries with criteria on both hash and range not possible'. So I thought I would just remove the second criterion and do it like this:

Page<ArticleItem> findByCategoryOrderByCreationdateDesc(String category, Date date, Pageable pageRequest);

but that one gave me an error saying that I can only order by "category" (I don't remember the exact exception message). So I just used the first one and commented out lines 153-157 in AbstractDynamoDBQueryCriteria.java:

//                  if (queryRequest.getKeyConditions().size() > 1) {
//                      throw new UnsupportedOperationException(
//                              "Sorting for global index queries with criteria on both hash and range not possible");
//
//                  }

Now the first call worked very well.

However, whenever I want to query a table, it only allows one criterion besides the "sorting criterion". As I understand it, spring-data-dynamodb currently supports querying by at most two criteria, which means that a query like

Page<ArticleItem> findByCategoryAndSubcategoryAndCreationdateBeforeOrderByCreationdateDesc(String category, String subcategory, Date date, Pageable pageRequest);

does not work at the moment. Is that true or am I missing something?

trelorny commented 9 years ago

By the way here is my outline of the ArticleItem-Class which contains the information about the GSIs and the Hash-Key:

@DynamoDBTable(tableName = "articles")
public class ArticleItem extends Link {

    private String id;

    @DynamoDBIndexHashKey(globalSecondaryIndexName = "owner-creationdate-index")
    private String owner;

    @DynamoDBIndexHashKey(globalSecondaryIndexName = "category-creationdate-index")
    private String category;

    @DynamoDBIndexHashKey(globalSecondaryIndexName = "subcategory-creationdate-index")
    private String subcategory;

    @DynamoDBIndexRangeKey(globalSecondaryIndexNames = {"category-creationdate-index", "subcategory-creationdate-index", "owner-creationdate-index"})
    private Date creationdate;

    // ...some more attributes...

    @DynamoDBHashKey(attributeName = "id")
    @DynamoDBAutoGeneratedKey
    public String getId() {
        return id;
    }

    @DynamoDBAttribute
    public String getSubcategory() {
        return subcategory;
    }

    public void setSubcategory(String subcategory) {
        this.subcategory = subcategory;
    }

    @DynamoDBAttribute
    public Date getCreationdate() {
        if (creationdate == null) {
            creationdate = new Date();
        }
        return creationdate;
    }

    public void setCreationdate(Date creationdate) {
        this.creationdate = creationdate;
    }

    @DynamoDBAttribute
    public String getCategory() {
        return category;
    }

    public void setCategory(String category) {
        this.category = category;
    }

    @DynamoDBAttribute
    public String getOwner() {
        return owner;
    }

    public void setOwner(String owner) {
        this.owner = owner;
    }
    //...other getters & setters...

On my table articles I had to add these 3 GSIs: owner-creationdate-index, category-creationdate-index, subcategory-creationdate-index. The good thing about it is that I can provide more throughput to the more often used GSIs for categories & subcategories and a little less for the articles of an owner.

My repository now looks like this:

@EnableScan
@EnableScanCount
public interface ArticleRepository extends DynamoDBPagingAndSortingRepository<ArticleItem, String> {

    Page<ArticleItem> findByCategoryAndCreationdateBeforeOrderByCreationdateDesc(String category, Date date, Pageable pageRequest);

    Page<ArticleItem> findBySubcategoryAndCreationdateBeforeOrderByCreationdateDesc(String subcategory, Date date, Pageable pageRequest);    

    Page<ArticleItem> findByOwnerAndCreationdateBeforeOrderByCreationdateDesc(String ownerId, Date date, Pageable pageRequest);

}

michaellavelle commented 9 years ago

I think you've identified an issue with the defensive checks, which hopefully I've now fixed.

In general, it isn't possible to request sort order when both index hash and range key conditions are specified, as the data may span multiple hash keys, and sort order is undefined.

However, if the criteria on index hash key is equality, then this should be possible - looks like the defensive checks weren't taking this into account.
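The rule described above can be modelled in a few lines of plain Java. This is a hypothetical sketch, not the actual check in spring-data-dynamodb: the class SortGuardSketch, the Operator enum and both methods are invented for illustration, and the exception message is made up. It only captures the condition that a requested sort is accepted when the index hash key criterion is an equality.

```java
// Hypothetical model of the rule; not the library's actual code.
final class SortGuardSketch {

    // Comparison operators a criterion on the index hash key might use (illustrative set).
    enum Operator { EQ, LT, GT, BETWEEN }

    // Equality pins the query to a single hash key, so range-key order is well defined.
    static boolean sortAllowed(Operator hashKeyOperator) {
        return hashKeyOperator == Operator.EQ;
    }

    // Rejects a sort request only when the hash key criterion is not an equality.
    static void checkSort(Operator hashKeyOperator, boolean sortRequested) {
        if (sortRequested && !sortAllowed(hashKeyOperator)) {
            throw new UnsupportedOperationException(
                    "Sorting requires an equality condition on the index hash key");
        }
    }

    public static void main(String[] args) {
        checkSort(Operator.EQ, true);       // accepted: one hash key, sorted by range key
        checkSort(Operator.BETWEEN, false); // accepted: no sort requested
        System.out.println(sortAllowed(Operator.LT));
    }
}
```

Under this model, whether a range key condition is also present does not matter for the decision; only the operator on the hash key does.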

I'm not in a position to try this against a real DynamoDB instance as I write this, but I hope the fix gives you the functionality you need. If you get the chance, perhaps you could try this latest code against your query and let me know if it works for you?

Many thanks for raising this.

Cheers,

Michael

bbradley72 commented 9 years ago

Hi Michael,

Thanks for creating this project.

I am relatively new to DynamoDB and Spring Data and am facing an issue with sorting and pagination. Hopefully, you can point me in the right direction regarding what I am doing wrong.

I have created the following classes/interfaces:

Domain:

@DynamoDBTable(tableName = "Brand")
public class Brand {

@Id
@GeneratedValue(strategy=GenerationType.AUTO)
private String id;
private String name;
private String lowercaseName;
private String lowercaseLetter;

@DynamoDBHashKey
@DynamoDBAutoGeneratedKey
public String getId() {
    return id;
}
public void setId(String id) {
    this.id = id;
}

@DynamoDBAttribute
public String getName() {
    return name;
}
public void setName(String name) {
    this.name = name;
}

@DynamoDBAttribute
public String getLowercaseName() {
    return lowercaseName;
}
public void setLowercaseName(String lowercaseName) {
    this.lowercaseName = lowercaseName;
}

@DynamoDBAttribute
public String getLowercaseLetter() {
    return lowercaseLetter;
}
public void setLowercaseLetter(String lowercaseLetter) {
    this.lowercaseLetter = lowercaseLetter;
}

}

Repository:

@EnableScan
@EnableScanCount
public interface BrandRepository extends DynamoDBPagingAndSortingRepository<Brand, String> {
    Page<Brand> findByLowercaseLetterOrderByLowercaseNameDesc(String lowercaseLetter, Pageable pageable);
}

I would like to receive 20 brands per page (where the lowercase letter is equal to the letter passed into the query) and have them ordered by lowercase name.

My Controller has this call:

PageRequest pageRequest = new PageRequest(0, 20);
List<Brand> brands = brandRepository.findByLowercaseLetterOrderByLowercaseNameDesc(letter, pageRequest).getContent();

I have created a GSI with the Hash Key of lowercaseLetter and Range Key of lowercaseName.

However, when I run this code, I receive the following error:

java.lang.UnsupportedOperationException: Sort not supported for scan expressions
    at org.socialsignin.spring.data.dynamodb.repository.query.DynamoDBEntityWithHashKeyOnlyCriteria.buildScanExpression(DynamoDBEntityWithHashKeyOnlyCriteria.java:95)
    at org.socialsignin.spring.data.dynamodb.repository.query.DynamoDBEntityWithHashKeyOnlyCriteria.buildFinderQuery(DynamoDBEntityWithHashKeyOnlyCriteria.java:65)
    at org.socialsignin.spring.data.dynamodb.repository.query.AbstractDynamoDBQueryCriteria.buildQuery(AbstractDynamoDBQueryCriteria.java:431)
    at org.socialsignin.spring.data.dynamodb.repository.query.DynamoDBQueryCreator.complete(DynamoDBQueryCreator.java:33)
    at org.socialsignin.spring.data.dynamodb.repository.query.DynamoDBQueryCreator.complete(DynamoDBQueryCreator.java:12)

I think my problem is with the way I have the GSI configured, but I am not sure what I did wrong. Can you provide any assistance on why this is running a scan instead of a query and how to get the results I'm trying to achieve?

michaellavelle commented 9 years ago

Hi

Thanks for your interest in the project. You should be able to perform the query you need by annotating your domain class with the @DynamoDBIndexHashKey and @DynamoDBIndexRangeKey annotations.

If you add @DynamoDBIndexHashKey(globalSecondaryIndexName="yourIndexName") to the getter for lowercase letter, and @DynamoDBIndexRangeKey(globalSecondaryIndexName="yourIndexName") to the getter for lowercase name, this should allow the query to execute.

Hope this helps,

Michael

bbradley72 commented 9 years ago

Hi Michael,

Thanks for helping me with this. That was exactly what I needed. It worked perfectly.

nilaytiwari commented 7 years ago

This is a great conversation, I have learned a lot.

nilaytiwari commented 7 years ago

How can we find the single item with the maximum range key?