ibm-aur-nlp / PubLayNet

Image IDs of Test set images #29

Open YeahMao opened 3 years ago

YeahMao commented 3 years ago

Hi PubLayNet Team,

Thank you for sharing the dataset. Is a JSON file (or a list of image IDs) provided for the test set?

ajjimeno commented 3 years ago

Hi YeahMao,

Currently, there is no public ground truth for the test set. The ground truth is being reserved for the following ICDAR competition:

https://icdar2021.org/competitions/competition-on-scientific-literature-parsing

YeahMao commented 3 years ago

Hi ajjimeno,

Thank you for your reply.

Yes, I am participating in this competition and trying to make a submission. The submission cannot pass the phase without the correct image IDs, so I am wondering how I can get the ID that corresponds to each image. For train and val, I found the IDs in the "train.json" and "val.json" files. For the test set, is there an alternative way to find the image IDs?

Thanks.
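For readers following along, the train/val lookup described above can be sketched as follows. This is a minimal sketch, assuming the annotation files follow the COCO layout, with one record per image under an "images" key and "id" and "file_name" fields on each record. The function name `file_name_to_image_id` is hypothetical and not part of the PubLayNet repository.

```python
import json
from pathlib import Path


def file_name_to_image_id(annotation_path):
    """Map each image file name to its ID in a COCO-style annotation file.

    Assumes the JSON has a top-level "images" list whose entries carry
    "file_name" and "id" keys.
    """
    with Path(annotation_path).open(encoding="utf-8") as handle:
        data = json.load(handle)
    # One record per image; later duplicates (if any) overwrite earlier ones.
    return {image["file_name"]: image["id"] for image in data["images"]}
```

Calling `file_name_to_image_id("val.json")` would then return a dictionary keyed by file name. The same lookup only works for the test set once a file with that information is published.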

ajjimeno commented 3 years ago

Hi YeahMao,

Thank you for pointing it out. We are preparing the file linking the file names and ids and will release it soon.

ajjimeno commented 3 years ago

I have made the IDs available in the challenge folder of the GitHub repository: https://github.com/ibm-aur-nlp/PubLayNet/blob/master/ICDAR_SLR_competition/test_ids.json

Best regards, Antonio
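Once a file-name-to-ID mapping is in hand, attaching the IDs to model output might look like the sketch below. This is an illustration under stated assumptions, not the competition's documented format: the thread does not specify the submission layout, so the records here follow the standard COCO detection results format (`image_id`, `category_id`, `bbox` as `[x, y, width, height]`, `score`). The function name `to_coco_results` and the shape of the `predictions` argument are hypothetical.

```python
def to_coco_results(predictions, name_to_id):
    """Convert per-file predictions into COCO-style result records.

    predictions: {file_name: [(category_id, [x, y, w, h], score), ...]}
    name_to_id:  {file_name: image_id}
    Both shapes are assumptions made for this sketch.
    """
    results = []
    for file_name, detections in predictions.items():
        # A KeyError here signals a file name missing from the ID mapping.
        image_id = name_to_id[file_name]
        for category_id, bbox, score in detections:
            results.append({
                "image_id": image_id,
                "category_id": category_id,
                "bbox": [float(value) for value in bbox],
                "score": float(score),
            })
    return results
```

The resulting list can be written out with `json.dump`. Check the competition page for the exact fields it expects before submitting.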

YeahMao commented 3 years ago

Hi Antonio,

Thank you for the update.

I have another question regarding the ICDAR competition. Is there a ground truth for the mini validation set (20 examples)?

I noticed that an "annotations.png" file is provided, which is converted from the "samples.json" file. On the competition page, there is a host submission that matches this JSON file. Since its overall score is not 1 out of 1, I reckon it is just a baseline? Or should we regard annotations.png as the ground truth? Thank you in advance.

Best regards, Tony

ajjimeno commented 3 years ago

Hi Tony,

The annotations for those files should be available from the development/validation set. Please let me know if you have any issue finding them.

Best regards, Antonio
