MedKhem / grobid-dictionaries

Sample Documents aren't available #29

Open epugh opened 5 years ago

epugh commented 5 years ago

This may be "works as designed", but I wanted to raise it. The Docker setup instructions (https://github.com/MedKhem/grobid-dictionaries/wiki/Docker_Instructions) reference the training docs as being in a Google Drive folder, and that folder requires permission to access.

epugh commented 5 years ago

I'm working my way through the source code, trying to create my own segmentation model by learning how the DictionarySegmentationTrainer works, so having some sample docs would be useful!

MedKhem commented 5 years ago

Hi, the Docker instructions are meant to be followed during a workshop that I teach. For copyright reasons, access to the training data is restricted. I can send you the data privately if you give me an email address.

epugh commented 5 years ago

I’d love to chat with you more about the direction of the project, and see if I can’t provide some data as well that is public…

epugh@opensourceconnections.com

Eric


