datasalt / splout-db

A web-latency SQL spout for Hadoop.

Maximum failed records #11

Open pereferrera opened 11 years ago

pereferrera commented 11 years ago

It would be nice to have an API feature that allows some records to fail when they are exported to Splout SQL. When working with big data sets, this is a common and desirable option. Google BigQuery offers this possibility.

The idea would be to keep a counter of failed records and throw an Exception if the count exceeds a configured maximum. The problem is how to do it, since we can only keep track of the failed records within a single Mapper.
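A minimal sketch of the counter idea, in plain Java. The class name `FailedRecordTracker` and its methods are hypothetical and are not part of the Splout SQL API. It only tracks failures seen by one instance, so it illustrates the per-Mapper limitation rather than solving it.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper (assumption): not an existing Splout SQL class. */
class FailedRecordTracker {
    private final long maxFailedRecords;
    private final AtomicLong failed = new AtomicLong();

    FailedRecordTracker(long maxFailedRecords) {
        if (maxFailedRecords < 0) {
            throw new IllegalArgumentException("maxFailedRecords must be >= 0");
        }
        this.maxFailedRecords = maxFailedRecords;
    }

    /** Call once per record that could not be exported. */
    void recordFailure(Throwable cause) {
        long count = failed.incrementAndGet();
        // Tolerate up to maxFailedRecords failures, then abort.
        if (count > maxFailedRecords) {
            throw new IllegalStateException(
                "Too many failed records: " + count + " > " + maxFailedRecords, cause);
        }
    }

    long failedCount() {
        return failed.get();
    }
}
```

With a limit of 2, the first two calls to `recordFailure` return normally and the third throws. Enforcing a job-wide limit would still require combining the counts from all Mappers in some way.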

ivanprado commented 11 years ago

Also, a log where all failures are recorded would be needed. It could be used to show the user which data failed and why.
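As a rough illustration of such a log, here is a sketch that keeps each failed record together with the reason it failed. `FailureLog` and `Entry` are hypothetical names (assumptions, not Splout SQL classes), and the entries are held in memory only for simplicity. A real job would more likely write them to a side file.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical failure log (assumption): not an existing Splout SQL class. */
class FailureLog {
    /** One failed record plus the reason it failed. */
    static final class Entry {
        final String record;
        final String reason;

        Entry(String record, String reason) {
            this.record = record;
            this.reason = reason;
        }

        @Override
        public String toString() {
            // Tab-separated so the log is easy to grep or load later.
            return reason + "\t" + record;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    /** Remember the raw record and why exporting it failed. */
    void add(String record, Exception cause) {
        String reason = cause.getClass().getSimpleName() + ": " + cause.getMessage();
        entries.add(new Entry(record, reason));
    }

    /** Read-only view, e.g. for reporting to the user. */
    List<Entry> entries() {
        return Collections.unmodifiableList(entries);
    }
}
```

Each entry answers the two questions raised above: which data failed (`record`) and why (`reason`).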

2013/1/29 Pere Ferrera notifications@github.com



Iván de Prado CEO & Co-founder www.datasalt.com