BAAI-DCAI / M3D

M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models

MIT License

208 stars, 10 forks

Clarification on M3D-Seg Test Set Splitting in Paper vs. README #30

Open yunfeixie233 opened 3 weeks ago

yunfeixie233 commented 3 weeks ago

Hello @baifanxxx,

I noticed an inconsistency regarding the test set splitting for the M3D-Seg dataset between the paper and the dataset's README on Hugging Face.

In the paper, it is stated:

"In M3D-Seg, 20% of the data from AbdomenCT-1K [42], Totalsegmnetator [66], and CT-Organ [53] is allocated as the test set for both semantic segmentation and referring expression segmentation."

However, in the Hugging Face README, it mentions:

"Each sub-dataset folder is split into train and test parts through a JSON file, including other dataset 20%."

Could you please clarify which data splitting method is correct? How can I reproduce the results as reported in the paper?

Thank you for your assistance.

baifanxxx commented 3 weeks ago

Hi,

Thank you for your interest. Both statements are correct. We split every segmentation dataset into training and test sets at an 8:2 ratio, consistent with SegVol. Due to space limitations, however, the paper reports results for only a subset of the datasets: AbdomenCT-1K, TotalSegmentator, and CT-Organ. You are welcome to compare against us on any of the datasets using the same split, not just the three presented in the paper.