Azure / azure-data-lake-store-java

Microsoft Azure Data Lake Store Filesystem Library for Java

The retry count of ExponentialBackoffPolicy created by ADLFileInputStream is not configurable #29

Open wheezil opened 5 years ago

wheezil commented 5 years ago

We occasionally encounter errors under heavy load where all 5 tries are exhausted:

com.microsoft.azure.datalake.store.ADLFileInputStream.read: com.microsoft.azure.datalake.store.ADLException: Error reading from file [filename]
Operation OPEN failed with HTTP429 : ThrottledException
Last encountered exception thrown after 5 tries. [HTTP429(ThrottledException),HTTP429(ThrottledException),HTTP429(ThrottledException),HTTP429(ThrottledException),HTTP429(ThrottledException)]
[ServerRequestId: redacted]
com.microsoft.azure.datalake.store.ADLStoreClient.getExceptionFromResponse(ADLStoreClient.java:1179)
com.microsoft.azure.datalake.store.ADLFileInputStream.readRemote(ADLFileInputStream.java:252)
com.microsoft.azure.datalake.store.ADLFileInputStream.readInternal(ADLFileInputStream.java:221)
com.microsoft.azure.datalake.store.ADLFileInputStream.readFromService(ADLFileInputStream.java:132)
com.microsoft.azure.datalake.store.ADLFileInputStream.read(ADLFileInputStream.java:101)

The readRemote() method constructs its retry policy with the no-argument ExponentialBackoffPolicy() constructor, and there doesn't appear to be any way to specify more retries or a steeper backoff. In our use case, Hadoop tasks run in parallel, and they apparently overwhelm the default backoff strategy.
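To illustrate the kind of configurability being requested: a minimal, self-contained sketch of a parameterized exponential backoff policy. This is not the library's API; the class name, constructor parameters, and `nextWaitMs()` method are all assumptions made for illustration, standing in for what a configurable `ExponentialBackoffPolicy` might expose.

```java
// Hypothetical sketch of a configurable exponential backoff policy.
// The real ExponentialBackoffPolicy in azure-data-lake-store-java is created
// with its default constructor inside readRemote(), so callers cannot tune it;
// this class shows the knobs the issue asks for (retry count, base interval,
// growth factor). All names here are illustrative, not the library's API.
public class ConfigurableBackoff {
    private final int maxRetries;     // how many retries before giving up
    private final int baseIntervalMs; // wait before the first retry
    private final int factor;         // multiplier applied per retry
    private int attempt = 0;

    public ConfigurableBackoff(int maxRetries, int baseIntervalMs, int factor) {
        this.maxRetries = maxRetries;
        this.baseIntervalMs = baseIntervalMs;
        this.factor = factor;
    }

    /** Returns the wait in ms before the next retry, or -1 when retries are exhausted. */
    public int nextWaitMs() {
        if (attempt >= maxRetries) {
            return -1; // out of retries; caller should surface the last exception
        }
        int wait = baseIntervalMs;
        for (int i = 0; i < attempt; i++) {
            wait *= factor; // exponential growth: base * factor^attempt
        }
        attempt++;
        return wait;
    }
}
```

With a steeper factor or a higher retry count, parallel Hadoop tasks that hit HTTP 429 throttling would spread their retries out further instead of exhausting all attempts in quick succession.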

rahuldutta90 commented 5 years ago

@wheezil we had an internal PR regarding this but could not get to it. I saw your PR and left a few comments.