GoogleCloudPlatform / DataflowJavaSDK

Google Cloud Dataflow provides a simple, powerful model for building both batch and streaming parallel data processing pipelines.
http://cloud.google.com/dataflow

DatastoreIO can't read large dataset from datastore #537

Closed email2liyang closed 7 years ago

email2liyang commented 7 years ago

We have a Google Datastore kind with 61,647,456 entities, around 21.4 GB in total. When we try to read the whole kind with the code below, the job fails.

final Pipeline pipeline = Pipeline.create(options);
pipeline.apply("read from datastore",
        DatastoreIO.v1().read().withProjectId(options.getProject())
            .withNamespace(CITE_NAMESPACE)
            .withQuery(queryBuilder.build()))
    .apply("xx", ParDo.of(new DoFn<Entity, String>() {  // output type is a placeholder
      @Override
      public void processElement(ProcessContext c) {
        // real processing elided
      }
    }));

pipeline.run();

We have tried 3 times, and each run failed with the error below.

Is there anything we can do to tune the steps?

(c4effe748a3cb249): Workflow failed. Causes: (4c8d3a94c2fb7abe): S09:read from datastore/GroupByKey/Read+read from datastore/GroupByKey/GroupByWindow+read from datastore/Values/Values+read from datastore/Flatten.FlattenIterables/FlattenIterables+read from datastore/ParDo(Read)+get family id without applicant+write family id out/Write/DataflowPipelineRunner.BatchWrite/Window.Into()+write family id out/Write/DataflowPipelineRunner.BatchWrite/WriteBundles+write family id out/Write/DataflowPipelineRunner.BatchWrite/View.AsIterable/DataflowPipelineRunner.BatchViewAsIterable/ParDo(ToIsmRecordForGlobalWindow) failed.
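
For anyone hitting the same wall, one common mitigation (not confirmed in this thread, and the worker counts, machine type, and disk size below are hypothetical) is to give the job more and larger workers through DataflowPipelineOptions before creating the pipeline. A minimal sketch, assuming the SDK 1.x options interfaces:

import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.options.DataflowPipelineOptions;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;

public class TunedPipelineSetup {  // hypothetical helper class
  public static Pipeline create(String[] args) {
    DataflowPipelineOptions options = PipelineOptionsFactory
        .fromArgs(args).withValidation().as(DataflowPipelineOptions.class);

    // Hypothetical sizing for a ~60M-entity read; adjust to quota and budget.
    options.setNumWorkers(10);
    options.setMaxNumWorkers(50);
    options.setWorkerMachineType("n1-standard-4");
    options.setDiskSizeGb(100);

    return Pipeline.create(options);
  }
}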

email2liyang commented 7 years ago

I got an answer from Google support: we can export a Datastore backup to Google Cloud Storage, load the backup file into Google BigQuery, and then query it there and export the result as a CSV file.
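
As a rough sketch of the BigQuery half of that workaround (the Datastore backup itself still has to be produced via the Datastore Admin export first; the bucket, dataset, and table names below are made up), the google-cloud-bigquery Java client can load a backup file and extract the table to CSV:

import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.ExtractJobConfiguration;
import com.google.cloud.bigquery.FormatOptions;
import com.google.cloud.bigquery.JobInfo;
import com.google.cloud.bigquery.LoadJobConfiguration;
import com.google.cloud.bigquery.TableId;

public class DatastoreBackupToCsv {
  public static void main(String[] args) throws Exception {
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
    // Hypothetical dataset and table names.
    TableId table = TableId.of("my_dataset", "cite_entities");

    // 1) Load the Datastore backup (the .backup_info file in GCS) into BigQuery.
    LoadJobConfiguration load = LoadJobConfiguration
        .newBuilder(table, "gs://my-bucket/backup/cite.backup_info")
        .setFormatOptions(FormatOptions.datastoreBackup())
        .build();
    bigquery.create(JobInfo.of(load)).waitFor();

    // 2) Export the loaded (or queried) table to CSV files in GCS.
    ExtractJobConfiguration extract = ExtractJobConfiguration
        .newBuilder(table, "gs://my-bucket/export/cite-*.csv")
        .setFormat("CSV")
        .build();
    bigquery.create(JobInfo.of(extract)).waitFor();
  }
}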