apache / celeborn

Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
https://celeborn.apache.org/
Apache License 2.0

[CELEBORN-1674] Fix reader thread name of MapPartitionData #2853

Closed by SteNicholas 1 week ago

SteNicholas commented 1 week ago

What changes were proposed in this pull request?

Fix the reader thread name of MapPartitionData, which currently contains null.

Why are the changes needed?

The reader thread name of MapPartitionData currently starts with null because MapFileMeta#getMountPoint returns null. The jstack output and worker log below show the affected thread names:

celeborn@jscs-bigdata-rss-worker:/data/service/celeborn$ jstack 65|grep reader-thread
"null-reader-thread-7" #798 prio=5 os_prio=0 tid=0x00007ef03bca8000 nid=0x47f waiting on condition [0x00007eef068cb000]
"null-reader-thread-7" #799 prio=5 os_prio=0 tid=0x00007ef03a097000 nid=0x47e waiting on condition [0x00007eef069cc000]
"null-reader-thread-5" #796 prio=5 os_prio=0 tid=0x00007ef03a818000 nid=0x47d waiting on condition [0x00007eef06acd000]
"null-reader-thread-6" #797 prio=5 os_prio=0 tid=0x00007ef03b896800 nid=0x47c waiting on condition [0x00007eef06bce000]
"null-reader-thread-4" #793 prio=5 os_prio=0 tid=0x00007ef03ac6b000 nid=0x47b waiting on condition [0x00007eef06ccf000]
"null-reader-thread-6" #794 prio=5 os_prio=0 tid=0x00007ef05829e800 nid=0x47a waiting on condition [0x00007eef06dd0000]
"null-reader-thread-7" #795 prio=5 os_prio=0 tid=0x00007ef03b06b800 nid=0x479 waiting on condition [0x00007eef06ed1000]
"null-reader-thread-3" #789 prio=5 os_prio=0 tid=0x00007ef03a095000 nid=0x478 waiting on condition [0x00007eef06fd2000]
"null-reader-thread-3" #790 prio=5 os_prio=0 tid=0x00007ef03a817000 nid=0x477 waiting on condition [0x00007eef070d3000]
"null-reader-thread-4" #791 prio=5 os_prio=0 tid=0x00007ef03b895000 nid=0x476 waiting on condition [0x00007eef071d4000]
"null-reader-thread-5" #792 prio=5 os_prio=0 tid=0x00007ef03b06a800 nid=0x475 waiting on condition [0x00007eef072d5000]
"null-reader-thread-4" #786 prio=5 os_prio=0 tid=0x00007ef03d06b800 nid=0x474 waiting on condition [0x00007eef073d6000]
"null-reader-thread-5" #787 prio=5 os_prio=0 tid=0x00007ef03bca8800 nid=0x473 waiting on condition [0x00007eef074d7000]
"null-reader-thread-3" #785 prio=5 os_prio=0 tid=0x00007ef03c884800 nid=0x472 waiting on condition [0x00007eef075d8000]
"null-reader-thread-6" #788 prio=5 os_prio=0 tid=0x00007ef03cc6b800 nid=0x471 waiting on condition [0x00007eef076d9000]
"null-reader-thread-2" #783 prio=5 os_prio=0 tid=0x00007ef03c06a000 nid=0x470 waiting on condition [0x00007eef077da000]
"null-reader-thread-2" #784 prio=5 os_prio=0 tid=0x00007ef05829d000 nid=0x46f waiting on condition [0x00007eef078db000]
"null-reader-thread-2" #782 prio=5 os_prio=0 tid=0x00007ef03a815800 nid=0x46e waiting on condition [0x00007eef079dc000]
"null-reader-thread-1" #781 prio=5 os_prio=0 tid=0x00007ef01d852000 nid=0x46d waiting on condition [0x00007eef07add000]
"null-reader-thread-1" #780 prio=5 os_prio=0 tid=0x00007ef03a815000 nid=0x46c waiting on condition [0x00007eef07bde000]
"null-reader-thread-1" #779 prio=5 os_prio=0 tid=0x00007ef03ac6c800 nid=0x46b waiting on condition [0x00007eef07cdf000]
"null-reader-thread-0" #777 prio=5 os_prio=0 tid=0x00007ef03d06a800 nid=0x46a waiting on condition [0x00007eef07de0000]
"null-reader-thread-0" #778 prio=5 os_prio=0 tid=0x00007ef03ac6b800 nid=0x469 waiting on condition [0x00007eef07ee1000]
"null-reader-thread-0" #776 prio=5 os_prio=0 tid=0x00007ef03a095800 nid=0x468 waiting on condition [0x00007eef07fe2000]
[ERROR][null-reader-thread-6] - org.apache.celeborn.service.deploy.worker.storage.MapPartitionData -MapPartitionData.java(205) -reader exception, reader: DataPartitionReader{startPartitionIndex=834, endPartitionIndex=834, streamId=1774189696911}, message: Partition reader has been failed or finished.
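The null- prefix in the thread names above is what Java string concatenation produces when a null reference is joined with a literal. The following is a minimal, standalone sketch of that mechanism and of one possible guard. It is not the Celeborn code or the actual patch: the class ReaderThreadNaming, its methods, and the fallback name "unknown-mount" are hypothetical names chosen for illustration.

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class ReaderThreadNaming {

    // Reproduces the symptom: concatenating a null String yields the text "null".
    static String naivePrefix(String mountPoint) {
        return mountPoint + "-reader-thread";
    }

    // Hypothetical guard: fall back to a caller-supplied name when the
    // mount point is missing, so the prefix never contains "null".
    static String readerThreadPrefix(String mountPoint, String fallback) {
        String base = (mountPoint == null || mountPoint.isEmpty()) ? fallback : mountPoint;
        return base + "-reader-thread";
    }

    // Simple factory that names threads "<prefix>-<n>", starting at 0.
    static ThreadFactory namedDaemonFactory(String prefix) {
        AtomicInteger counter = new AtomicInteger(0);
        return runnable -> {
            Thread thread = new Thread(runnable, prefix + "-" + counter.getAndIncrement());
            thread.setDaemon(true);
            return thread;
        };
    }

    public static void main(String[] args) {
        String mountPoint = null; // stands in for a getter that returns null

        // Prints "null-reader-thread", matching the prefix seen in jstack.
        System.out.println(naivePrefix(mountPoint));

        // Prints "unknown-mount-reader-thread-0" (fallback name is an assumption).
        ThreadFactory factory =
            namedDaemonFactory(readerThreadPrefix(mountPoint, "unknown-mount"));
        System.out.println(factory.newThread(() -> {}).getName());
    }
}
```

Whether a fallback name or a different source for the mount point is the right remedy depends on what information is available where the thread pool is created; the sketch only shows why the null appears and how a guard keeps it out of the name.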

Does this PR introduce any user-facing change?

No.

How was this patch tested?

GA (GitHub Actions).