Closed: uladzimir-shelhunou closed this issue 8 years ago.
This is a known issue because Apache Commons' pool implementation is not serializable.
Here's my comment from the link above:
This serialization issue is similar to the issue described at https://github.com/dbpedia/distributed-extraction-framework/issues/9. It looks as if one would need to use a different object pool implementation than the one from Apache Commons Pool, as the latter may be hard to serialize (see sparkContext broadcast JedisPool not work).
Furthermore, I noticed that the serialization issue is also triggered locally (e.g. when running ./sbt test in kafka-storm-starter) when using Spark 1.2+.
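For illustration, here is a minimal, self-contained sketch (class and variable names are mine, not from the project) showing that plain Java serialization of a Commons Pool GenericObjectPool fails in the same way Spark's broadcast serialization does:

import java.io.ByteArrayOutputStream;
import java.io.ObjectOutputStream;
import org.apache.commons.pool2.BasePooledObjectFactory;
import org.apache.commons.pool2.PooledObject;
import org.apache.commons.pool2.impl.DefaultPooledObject;
import org.apache.commons.pool2.impl.GenericObjectPool;

public class PoolSerializationDemo {
    public static void main(String[] args) throws Exception {
        GenericObjectPool<String> pool = new GenericObjectPool<>(
            new BasePooledObjectFactory<String>() {
                @Override public String create() { return "connection"; }
                @Override public PooledObject<String> wrap(String obj) {
                    return new DefaultPooledObject<>(obj);
                }
            });
        // Throws java.io.NotSerializableException:
        // org.apache.commons.pool2.impl.GenericObjectPool
        new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(pool);
    }
}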
To fix this issue, you need to implement your own serializable pool.
I understand why it happens, but I didn't know that it is a known issue. I want to implement my own pool, but I have some concerns about partitions and offsets.
I did something like the following in Java:
import static lombok.AccessLevel.PRIVATE;

import java.io.Serializable;
import java.util.NoSuchElementException;

import lombok.Getter;
import lombok.NonNull;
import lombok.RequiredArgsConstructor;
import lombok.experimental.Accessors;
import org.apache.commons.pool2.ObjectPool;

/**
 * An ObjectPool that can be serialized (e.g. for Spark broadcasts) even though
 * the Commons Pool implementation cannot: the non-serializable delegate is kept
 * transient and re-created lazily on first use via Lombok's @Getter(lazy = true).
 */
@RequiredArgsConstructor
public abstract class LazySerializableObjectPool<T> implements ObjectPool<T>, Serializable {

    // Transient, so the delegate is skipped during serialization; the
    // generated private fluent getter delegate() creates it on demand.
    @NonNull
    @Getter(lazy = true, value = PRIVATE)
    @Accessors(fluent = true)
    private final transient ObjectPool<T> delegate = createDelegate();

    /** Builds the actual (non-serializable) pool on the side that uses it. */
    protected abstract ObjectPool<T> createDelegate();

    @Override
    public T borrowObject() throws Exception, NoSuchElementException, IllegalStateException {
        return delegate().borrowObject();
    }

    @Override
    public void returnObject(T obj) throws Exception {
        delegate().returnObject(obj);
    }

    @Override
    public void invalidateObject(T obj) throws Exception {
        delegate().invalidateObject(obj);
    }

    @Override
    public void addObject() throws Exception, IllegalStateException, UnsupportedOperationException {
        delegate().addObject();
    }

    @Override
    public int getNumIdle() {
        return delegate().getNumIdle();
    }

    @Override
    public int getNumActive() {
        return delegate().getNumActive();
    }

    @Override
    public void clear() throws Exception, UnsupportedOperationException {
        delegate().clear();
    }

    @Override
    public void close() {
        delegate().close();
    }
}
import java.util.Map;

import lombok.NonNull;
import lombok.RequiredArgsConstructor;
import lombok.val;
import org.apache.commons.pool2.ObjectPool;
import org.apache.commons.pool2.impl.GenericObjectPool;
import org.apache.commons.pool2.impl.GenericObjectPoolConfig;
import org.apache.kafka.clients.producer.KafkaProducer;

@RequiredArgsConstructor
public class KafkaProducerObjectPool extends LazySerializableObjectPool<KafkaProducer> {

    /**
     * Producer configuration; must itself be serializable since it is
     * captured along with the pool.
     */
    @NonNull
    private final Map<String, String> producerProperties;

    @Override
    protected ObjectPool<KafkaProducer> createDelegate() {
        val pooledObjectFactory = ...; // a PooledObjectFactory<KafkaProducer> built from producerProperties
        val maxNumProducers = 10;
        val poolConfig = new GenericObjectPoolConfig();
        poolConfig.setMaxTotal(maxNumProducers);
        poolConfig.setMaxIdle(maxNumProducers);
        return new GenericObjectPool<KafkaProducer>(pooledObjectFactory, poolConfig);
    }
}
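For completeness, a sketch of how such a pool might then be used from Spark. This is not from the original thread; the JavaSparkContext sc, the JavaRDD<String> lines, the topic name "my-topic", and the KafkaSink helper class are all assumptions for illustration:

import java.util.Map;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.broadcast.Broadcast;

public class KafkaSink {
    public static void writePartitions(JavaSparkContext sc, JavaRDD<String> lines,
                                       Map<String, String> producerProperties) {
        // The pool serializes cheaply because its delegate is transient;
        // each executor creates its own delegate pool on first borrowObject().
        Broadcast<KafkaProducerObjectPool> pool =
            sc.broadcast(new KafkaProducerObjectPool(producerProperties));

        lines.foreachPartition(records -> {
            KafkaProducer producer = pool.value().borrowObject();
            try {
                while (records.hasNext()) {
                    producer.send(new ProducerRecord<String, String>("my-topic", records.next()));
                }
            } finally {
                pool.value().returnObject(producer);
            }
        });
    }
}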
See https://projectlombok.org/features/GetterLazy.html for details.
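Roughly, Lombok replaces the annotated field with an AtomicReference and generates a double-checked-locking getter. A simplified sketch of what the generated members of LazySerializableObjectPool look like (the real generated code also handles a null result from createDelegate(); see the docs linked above):

// needs java.util.concurrent.atomic.AtomicReference
private final transient AtomicReference<Object> delegate = new AtomicReference<>();

@SuppressWarnings("unchecked")
private ObjectPool<T> delegate() {
    Object value = delegate.get();
    if (value == null) {
        synchronized (delegate) {
            value = delegate.get();
            if (value == null) {
                // first call on this JVM: build the real pool
                value = createDelegate();
                delegate.set(value);
            }
        }
    }
    return (ObjectPool<T>) value;
}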
@btiernay thanks for sharing your solution, it really helped me out.
Hello, I tried to build the same producer pool in my Spark application, but I got this error when Spark tried to broadcast the pool. Could you please comment on this issue?