ibm-aur-nlp / PubLayNet

Download questions #1

Closed phexic closed 4 years ago

phexic commented 5 years ago

Great job, and thank you for sharing such a large-scale document dataset. However, my download speed is very slow, and the downloads often disconnect. Is there any other way to get these datasets?

zhxgj commented 5 years ago

@phexic thanks for your interest. We will look into this issue and solve it asap.

phexic commented 5 years ago

@zhxgj Great!

zhxgj commented 5 years ago

@phexic we tested a few geographic regions and got decent download speeds from Box. Can you please let us know which geographic region you are downloading the data from?

phexic commented 5 years ago

@zhxgj Maybe the reason is that I'm in China.

zhxgj commented 5 years ago

@phexic Hmm, maybe Box does not work well in China. Let me try to work out a solution for you.

phexic commented 5 years ago

@zhxgj Oh, wow! Thanks a million.
Will you publish the pre-trained models for document layout?

zhxgj commented 5 years ago

@phexic This is a great suggestion. I will follow up with our legal team regarding releasing the pre-trained model and maybe the training config file.

shreyansh05s commented 4 years ago

@zhxgj Hi, any news from your legal team on whether you can release the pre-trained models?

zhxgj commented 4 years ago

Hi @phexic, the data has been migrated to the IBM DAX platform. I think the download should be more stable now. Please see the instructions in the README.