Open zuston opened 1 year ago
We could use MEMORY_LOCALFILE_HDFS instead; HDFS has more capacity to handle these problems.
Yes, I think so. But users like VIP Shop hope this optimization can also be applied to local disk. Could you help ping them? Let's discuss further.
@xumanbu @Gustfh
Using HDFS can solve the capacity issue, but the HDFS write rate is slower than SSD, so memory will fill up quickly under heavy write load. In my opinion, if we use HDFS storage, the Uniffle client should apply backpressure to limit the client write rate.
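The client-side backpressure suggested above could be sketched as a byte-budget limiter: the writer must reserve budget before sending a block, and budget is returned only after the server acknowledges the flush, so the effective write rate converges to what the backing storage (e.g. HDFS) can absorb. This is purely illustrative; `InFlightLimiter` and its byte-budget semantics are hypothetical, not an existing Uniffle API.

```java
import java.util.concurrent.Semaphore;

// Hypothetical client-side backpressure sketch: permits represent bytes of
// shuffle data allowed to be in flight between client and shuffle server.
public class InFlightLimiter {
    private final Semaphore budget;

    public InFlightLimiter(int maxInFlightBytes) {
        this.budget = new Semaphore(maxInFlightBytes);
    }

    // Called before sending a block: blocks the writer thread until enough
    // budget is available, throttling the client write rate.
    public void acquire(int blockBytes) throws InterruptedException {
        budget.acquire(blockBytes);
    }

    // Called from the server-ack callback once the block has been flushed,
    // returning the budget to waiting writers.
    public void release(int blockBytes) {
        budget.release(blockBytes);
    }

    // Non-blocking probe, useful for a spill-instead-of-wait policy.
    public boolean tryAcquire(int blockBytes) {
        return budget.tryAcquire(blockBytes);
    }
}
```

A real implementation would size the budget from server-reported memory pressure rather than a fixed constant, but the blocking-reserve/ack-release shape is the core of the idea.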
Yes, it does. Please see https://github.com/apache/incubator-uniffle/blob/master/docs/server_guide.md#huge-partition-optimization
Code of Conduct
Search before asking
What would you like to be improved?
In #378, we discussed the huge partition problem, and the final solution was to flush huge partitions directly to HDFS and limit their memory usage. But for users who only use the MEMORY_LOCALFILE storage type, that optimization does not apply. This issue tracks huge partition optimization on local disk.
How should we improve?
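For context, the existing HDFS-based optimization from #378 is driven by server-side settings along these lines (key names as I recall them from the server guide; exact names and defaults may differ by version, so treat this as a hedged sketch rather than authoritative configuration):

```properties
# Mark a partition as "huge" once its total written size exceeds this threshold.
rss.server.huge-partition.size.threshold=20g

# Cap the fraction of shuffle-server memory a single huge partition may occupy;
# beyond this ratio its blocks are flushed to persistent storage immediately.
rss.server.huge-partition.memory.limit.ratio=0.2
```

The open question in this issue is what the analogous cap and flush target should be when only local disk is available.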
SubTasks
Are you willing to submit PR?