Netflix / astyanax

Cassandra Java Client
Apache License 2.0

Zero development activity #399

Closed kamenik closed 11 years ago

kamenik commented 11 years ago

wake up, guys;)

opuneet commented 11 years ago

@kamenik

Sorry for the lapse in activity. Actually, there's a lot going on in Astyanax here at Netflix. We are considering porting the entire interface over to use the java-driver which is cql based and supports async operations.

As you may know, the astyanax interface is huge! And porting over everything to use CQL3 is a non-trivial task requiring a lot of effort. But the end result should be very useful to users, since they can get the best of both worlds: all the recipes, the fluent interface, async operations and all the retry policies like the downgrading consistency policy, et al.

Currently we have very limited staff (ahem ... just me) working on Astyanax, hence balancing this work with the 100 odd issues is hard. But more troops are on the way :)

Anyways, is there any specific issue that you would like to pay attention to?

zznate commented 11 years ago

As someone with a passing familiarity with the underlying issues... :)

Have you bench tested thrift in 2.0 w/ Pavel's HSHA disruptor-based transport? (https://issues.apache.org/jira/browse/CASSANDRA-5582)

Despite what vendors are saying, I've seen several real world examples recently that make me believe it will require substantial effort to get CQL on-par performance wise given the amount of object overhead in statement creation and parsing (even with stored procedures). Eg. have you tried doing a real workload with inserts via CQL? It's a complete non-starter (even with batching/batch changes in 2.0).

You folks have my contact info. Happy to talk about this if you want to get into details.

opuneet commented 11 years ago

@zznate

I'm keeping the interface backwards compatible with the existing thrift based driver. And no, I haven't run the benchmark tests as yet. I wanted to do this when I had some code working with the java-driver, since I would have to benchmark the Astyanax lib as well (on top of the java-driver)

I'm closing this issue out, but yes will definitely keep in touch and reach out for details. Thanks!

zznate commented 11 years ago

@jasobrown dude, seriously? Please do a perf test before you folks burn effort on this. Would be curious what @vijay2win and his crew thought of this as well given @xedin's refactor. Sorry to name drop, but there are a lot of shops betting on this - API compat or no - who have had bad experiences with CQL already.

opuneet commented 11 years ago

@zznate

I think I closed the issue prematurely, sorry about that.

Yes we will absolutely be testing and benchmarking this. I wouldn't want to have all of Netflix switch over to this and then kill everyone's throughput in production :)

I just had a quick chat with @jasobrown and we are discussing shifting priorities for benchmarking sooner than later. We just need some basic code out there that does some simple inserts and reads so that we have something to stress test. What I'm pointing out here is that I need to also test the Astyanax layer over the CQL layer as well, since that will be the layer of indirection when using the lib and hence has potential for making things worse.

Thank you for the heads up.

zznate commented 11 years ago

I'm happy to help with this if you need some extra hands. We frequently are in situations where we can test with varying workloads. Let me know.

Vijay2win commented 11 years ago

:+1: CQL is good as a language but we do get a better performance via thrift. Not sure when we will switch, but it might be nice to keep this project alive :)

kamenik commented 11 years ago

Thank you for the response. I understand that it is hard with such a BIG team ;)

Must say I worry about CQL performance too, we use big mutations, mixed inserts/updates/deletes in one batch.

If you can add support for CAS ops, it would be very helpful.

boneill42 commented 11 years ago

Our teams are slowly getting their feet wet with CQL, targeting it for use in non-critical aspects of the system. We don't have many data points for performance yet. But honestly, performance is only one aspect motivating our migration. I'm just as concerned about feature/functional differences developing between CQL and Thrift.

In our PoC's, we are (re)evaluating the different abstraction layers. Since CQL takes a much more static approach to schema, does it make sense to rely more on ORM/JPA layers? For example, a traditional wide-row in thrift may become a Set in CQL. If that is the case, is an annotated member variable the right abstraction? (Are we sacrificing "NoSQL power" with that abstraction?)
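To make the abstraction question concrete, here is a minimal sketch of the same data in both shapes: a Thrift-style wide row whose column names are decided at write time, and a CQL/ORM-style entity whose Set-typed member is declared up front. It is plain in-memory Java, not Astyanax, java-driver or JPA code, and every name in it (`WideRowStore`, `UserEntity`, `toEntity`) is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.TreeSet;

public class WideRowVsSet {

    // Thrift-style model (hypothetical): row key -> sorted column names -> values.
    // Any column name can be added at write time; nothing is declared up front.
    static final class WideRowStore {
        private final Map<String, SortedMap<String, byte[]>> rows = new HashMap<>();

        void putColumn(String rowKey, String column, byte[] value) {
            rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(column, value);
        }

        SortedMap<String, byte[]> getRow(String rowKey) {
            return rows.getOrDefault(rowKey, new TreeMap<>());
        }
    }

    // CQL/ORM-style model (hypothetical): a statically declared entity
    // where the former column names live in a Set-typed member.
    static final class UserEntity {
        final String userId;
        final Set<String> tags = new TreeSet<>();

        UserEntity(String userId) {
            this.userId = userId;
        }
    }

    // Maps one wide row onto the entity: each column name becomes a set element.
    // Column values are dropped, which is one thing the Set abstraction gives up.
    static UserEntity toEntity(String rowKey, SortedMap<String, byte[]> row) {
        UserEntity entity = new UserEntity(rowKey);
        entity.tags.addAll(row.keySet());
        return entity;
    }

    public static void main(String[] args) {
        WideRowStore store = new WideRowStore();
        store.putColumn("user1", "cassandra", new byte[0]);
        store.putColumn("user1", "java", new byte[0]);

        UserEntity user = toEntity("user1", store.getRow("user1"));
        System.out.println(user.userId + " -> " + user.tags);
    }
}
```

The mapping only works cleanly when the column values carry no information. A wide row that stores a payload per column would need a Map-typed member or a clustering column instead.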

Thus, transitioning Astyanax to CQL may have implications that go up the stack into the actual API. For now, we could swap out thrift for CQL, and since the user won't be impacted it comes down to non-functionals like performance, but it may be worth considering "What does the Astyanax API look like in the future once it is riding on CQL?".

just food for thought. (we are long time Hector/Astyanax fans, but in light of CQL, we find ourselves looking at Spring Data and Kundera)

opuneet commented 11 years ago

Point taken. I've looked at some really basic use cases, which include:

  1. Simple reads - single row queries
  2. Simple writes - single row writes (put / add / delete columns)
  3. Batch mutations
  4. Paginating over wide rows for a single row
  5. Multiple kinds of column slice queries and col range queries
  6. Multiple kinds of row slice queries and row range queries

Most of this translates pretty well to the Astyanax api. I've seen problems where the columns are actually named columns instead of key, column1, column2, ..., value, and that makes the queries really confusing. I also noticed that I can't do row pagination yet, since I need cursor support from the java-driver, which is present only in their beta release.
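As an illustration of item 4 (paginating over a wide row), here is a minimal sketch of the usual approach: fetch a fixed-size slice of columns, remember the last column name seen, and start the next slice strictly after it. A `TreeMap` stands in for the sorted columns of a single row. This is not the Astyanax or java-driver API, and `WideRowPager` and `nextPage` are hypothetical names.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

public class WideRowPager {

    // Returns up to pageSize columns that sort strictly after lastSeen.
    // Passing null for lastSeen starts from the first column of the row.
    static <V> List<Map.Entry<String, V>> nextPage(
            NavigableMap<String, V> row, String lastSeen, int pageSize) {
        NavigableMap<String, V> rest =
                (lastSeen == null) ? row : row.tailMap(lastSeen, false);
        List<Map.Entry<String, V>> page = new ArrayList<>();
        for (Map.Entry<String, V> column : rest.entrySet()) {
            if (page.size() == pageSize) {
                break;
            }
            // Copy the entry so the page does not hold live references into the map.
            page.add(new AbstractMap.SimpleImmutableEntry<>(column));
        }
        return page;
    }

    public static void main(String[] args) {
        // One wide row with seven columns, col00 .. col06.
        NavigableMap<String, Integer> row = new TreeMap<>();
        for (int i = 0; i < 7; i++) {
            row.put(String.format("col%02d", i), i);
        }

        String lastSeen = null;
        List<Map.Entry<String, Integer>> page;
        while (!(page = nextPage(row, lastSeen, 3)).isEmpty()) {
            System.out.println(page);
            lastSeen = page.get(page.size() - 1).getKey();
        }
    }
}
```

Because the caller carries the resume point as a column name, no server-side cursor is needed for column pagination within one row. Row pagination is a different problem, since rows are not returned in a client-visible sort order under a random partitioner.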

Anyways, the gist is that most simple use cases seem to fit well with the Astyanax api. But then there were legit comments by other users (see Nate's remarks above) that are pretty crucial. To put it another way .... I would not want to announce the intentional release of a brand new Astyanax version over CQL which is less performant. Hence I'm focussing efforts on testing Astyanax with CQL at the moment.

I will communicate numbers as soon as I have something noteworthy.

zznate commented 11 years ago

Thanks for the update @opuneet. Fyi - I put together some details that might be helpful for conversion while working on a client's project: http://thelastpickle.com/blog/2013/09/13/CQL3-to-Astyanax-Compatibility.html

I'm currently waiting on permission from said client to publish a detailed case with numbers for mutations (the only place where there is a serious performance difference).

bjanssen1 commented 11 years ago

We're also trying to get a good handle on what to do here. We started using Cassandra about 8 months ago before CQL3 was recommended for use and adopted Astyanax. We've loved using it so far but have a lot of concerns about long term support for Thrift and the performance and limitations of the CQL3 abstraction.

The above article is much appreciated. Are there any other good resources for using Astyanax and CQL3 in conjunction? If we start building up our system and storing lots of data via Thrift, is there a clear conversion path for migration to CQL3 later?

Like others in this thread, we do a lot of mixed updates, inserts, and deletes of cells within the same row. Part of our use of the system is based on the assumption that if we batch these operations together, we are guaranteed that they all happen atomically (no chance of a temporary partially done state). Based on my understanding of CQL3 batching, this guarantee can't really be achieved any more unless I'm missing something.

If anyone has good sources of information on any of these topics, it would be much appreciated.

zznate commented 11 years ago

DataStax has committed to long term support for thrift on their enterprise products. More importantly, since Cassandra is a community driven ASF project, the community actually has the final say here, not DataStax product managers.

In short, use what works for you. If you are happy with Astyanax - most folks who use it are - then keep going forward there.

ghost commented 11 years ago

:+1:

opuneet commented 11 years ago

Regarding thrift support (within Astyanax), there is no plan within Netflix to remove it or even deprecate it. Hence all the thrift related ops that you perform with Astyanax will all be supported.

I'm currently reviewing if most of these operations can be supported with CQL as well.

@zznate Nice post. Mind if I reference this from the Astyanax wiki itself :)

zznate commented 11 years ago

@opuneet Cool - appreciate the roadmap clarification. And thanks! Please feel free to reference that post from the wiki.

lc-nyovchev commented 11 years ago

@opuneet in the last few weeks i have also been dabbling with cql3 support on astyanax. While I could make a lot of existing thrift ops possible with cql3, I have been wondering if supporting both with a unified syntax would be a reasonable solution. Take a look at this:

https://groups.google.com/forum/#!topic/astyanax-cassandra-client/ujnhsoM6xtI

This is a very basic example in which thrift and cql3 differ. And while it would indeed be possible to play around with the cf metadata and infer stuff from it, so the result data looks the same, it would be a very hacky solution. I am glad that more cql3 support, hopefully with native transport, is on the road map, but I am doubly happy that thrift support won't be dropped.

Also, from my very limited experience, cql3 heavy operations, even with the stored procedures optimization, perform a lot, like a lot worse than their thrift equivalents. Again, maybe my use cases were limited, but that is what I see currently and that is why my team is hesitant to even consider a full-on cql3 migration.

opuneet commented 11 years ago

Supporting everything that thrift does in cql3 and making it all live under a unified syntax is actually quite hard and not really worth solving.

Currently I'm seeing differences between the 2 in terms of features, model, syntax and performance and it's almost impossible for me to keep the entire Astyanax interface happy when someone wants to seamlessly switch over from thrift to cql3 / java-driver. I'm looking at simpler cases right now (like what we have in Netflix) w.r.t schema model, design etc, and I'm able to do a decent amount with Astyanax over java-driver actually. I also got composite columns to work.

There are going to be several caveats with the new Astyanax release, and we can incrementally work towards bridging the gap between thrift features and all that java-driver supports.

Also regarding performance, I've run some numbers and yes, java-driver was slower than Astyanax, but not by much actually (5%-10%). I've reached out to Sylvain, who can help me look at these numbers, my test driver, config, tuning etc.

opuneet commented 11 years ago

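BTW, one thing to note ... reusing prepared statements was the way to go. If you don't do that, performance will be really bad. Here is a simple example:

        // Assumes `cluster` is an already-built java-driver Cluster.
        // Keyspace, table and column names are placeholders.
        Session session = cluster.connect("my_keyspace");
        boolean stopped = false;
        Random random = new Random();

        // Prepared once, outside the loop, so the query text is parsed a single time.
        PreparedStatement pStmt = session.prepare("select * from my_table where key = ?");

        while (!stopped) {
            int rowKey = random.nextInt(10000);
            BoundStatement bStmt = pStmt.bind(rowKey);  // only the bound value changes
            session.execute(bStmt);
        }

        /** IS VERY DIFFERENT FROM */

        // A new query string on every iteration, so it is parsed on every call.
        while (!stopped) {
            int rowKey = random.nextInt(10000);
            session.execute("select * from my_table where key = " + rowKey);
        }

lc-nyovchev commented 11 years ago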

@opuneet While this will indeed give some performance enhancement, I was talking broadly about cql3, not in Astyanax context. If you have a hadoop job, for example, one where you just insert stuff with thrift, and one that inserts stuff with cql using a prepared statement, the one with thrift outperforms the one with cql3 (tested on a cluster with a RandomPartitioner) with C* 1.2.10. I am gonna agree with @zznate that in the current stable version of cassandra cql3 is still not what it should be in terms of performance and tools integration.
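The gap between reusing a prepared statement and sending a fresh query string each time, as in the example earlier in the thread, comes down to how often the query text has to be parsed. The sketch below is a toy model of that one effect, assuming a server that parses on every unprepared call and once per prepare. `ToyServer` and its methods are hypothetical stand-ins, not the java-driver or Cassandra API, and the model ignores network, serialization and storage costs entirely.

```java
import java.util.HashMap;
import java.util.Map;

public class ParseCountDemo {

    // Toy stand-in for a server that counts how many times it parses query text.
    static final class ToyServer {
        int parseCount = 0;
        private final Map<Integer, String> prepared = new HashMap<>();

        // Unprepared path: every call has to parse the query text.
        void execute(String query) {
            parse(query);
        }

        // Prepared path, step 1: parse once and hand back a statement id.
        int prepare(String query) {
            parse(query);
            int id = prepared.size();
            prepared.put(id, query);
            return id;
        }

        // Prepared path, step 2: look up the id and bind a value; no parsing.
        void executePrepared(int id, Object value) {
            if (!prepared.containsKey(id)) {
                throw new IllegalArgumentException("unknown statement id " + id);
            }
        }

        private void parse(String query) {
            parseCount++;
        }
    }

    public static void main(String[] args) {
        ToyServer unprepared = new ToyServer();
        for (int key = 0; key < 1000; key++) {
            unprepared.execute("select * from t where key = " + key);
        }

        ToyServer preparedOnce = new ToyServer();
        int id = preparedOnce.prepare("select * from t where key = ?");
        for (int key = 0; key < 1000; key++) {
            preparedOnce.executePrepared(id, key);
        }

        System.out.println("unprepared parses: " + unprepared.parseCount);
        System.out.println("prepared parses:   " + preparedOnce.parseCount);
    }
}
```

In this model the unprepared counter ends at 1000 and the prepared counter at 1. That explains why reuse helps, but it says nothing about the remaining difference between CQL and thrift for mutations, which is the point being made here.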

Seems like everybody is rushing into cql3 now, but yeah, things there change pretty fast, maybe it would be very hard, near impossible, to provide equivalent cql and thrift Astyanax syntax. After all, while the internal storage is the same, the cql way of thinking and representing data is very different from what we used to do in the past with thrift, so as far as I am concerned, it would be OK to treat cql and thrift differently API wise.

zznate commented 11 years ago

@opuneet out of curiosity, have you tried those performance comparisons with the disruptor-based HSHA transport on 2.0.x?

opuneet commented 11 years ago

@zznate nope, haven't had time to try this out yet.