cuilimeng / DETERRENT

Training/Testing json data #1

Open arxrean opened 4 years ago

arxrean commented 4 years ago

Thank you for sharing the code. However, the training JSON file seems to contain only one record. Is there a guide for generating more training data?

Also, would it be possible to upload a pretrained model, so that we can simply run the testing process and get reasonable results?

Thank you!

isspek commented 3 years ago

Hi @arxrean,

I would also like to use this study for my research. I figured out how to generate the training data. Under the data folder, there is a Python file called dataprocessing.py. You need the complete files from the knowledge base, namely the entities, relations, and triples (output.csv). With those, you can generate the train and test files.

However, I couldn't proceed with the second step, training. That step also needs entity and relation files, this time in .txt format. I wonder whether they are the same as the entity and triple files under the data folder, just in .txt form.
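If the .txt files really are just re-serialized versions of the CSV files, the conversion could look roughly like the sketch below. This is only a guess at the format: the two-column layout (name, id), the tab delimiter, and the helper name `csv_to_txt` are all assumptions, not something taken from the repository or from KnowLife.

```python
import csv
import io


def csv_to_txt(csv_text: str, delimiter: str = "\t") -> str:
    """Re-serialize CSV rows as delimiter-separated text lines.

    Hypothetical helper: the real .txt layout expected by the training
    script is not documented in this thread.
    """
    rows = csv.reader(io.StringIO(csv_text))
    # Skip empty rows and strip stray whitespace around each cell.
    return "\n".join(
        delimiter.join(cell.strip() for cell in row) for row in rows if row
    )


# Made-up sample rows (name, id); the actual KnowLife columns may differ.
sample = "aspirin,0\nheadache,1\n"
print(csv_to_txt(sample))
```

Whether the training step accepts this layout would still need to be checked against the file-loading code in the repository.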

In the paper, the graph is trained with R-GCN. I couldn't figure out where this process is located.

Dear @cuilimeng, could you please clarify these questions? Kind regards.

arxrean commented 3 years ago

Hi @isspek, thank you for your help! Yes, the complete knowledge base data is not available, so I could not run the code at that time.

I think they are the same, because only one medical knowledge graph is used.

I think the R-GCN part is located in TextRelationalGraphAttention.py (and also DETERRENT.py).

I suggest checking another GitHub project (https://github.com/esddse/GUpdater). It has very similar basic code (text + GRU + R-GCN) and runs smoothly. You can then come back to this one if you want to understand specific parts (e.g., positive/negative relations).
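For orientation while reading either code base, here is a minimal sketch of one generic R-GCN propagation step in plain Python: each node sums a self-loop term and relation-specific messages from its incoming neighbours, normalized per relation, followed by a ReLU. It is not taken from DETERRENT or GUpdater; the function names, the relation names "treats" and "causes", and the toy weights are all made up for illustration.

```python
from collections import defaultdict


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rgcn_layer(h, triples, W_rel, W_self):
    """One generic R-GCN step (illustrative, not the repository's code).

    h       : list of node feature vectors
    triples : (head, relation, tail) edges; messages flow head -> tail
    W_rel   : dict mapping relation name -> weight matrix
    W_self  : self-loop weight matrix
    """
    # Group the incoming neighbours of every node by relation type.
    neighbours = defaultdict(list)
    for head, rel, tail in triples:
        neighbours[(tail, rel)].append(head)

    # Start from the self-loop term W_self * h_i.
    out = [matvec(W_self, x) for x in h]
    for (node, rel), heads in neighbours.items():
        norm = 1.0 / len(heads)  # per-relation normalization 1 / |N_i^r|
        for j in heads:
            msg = matvec(W_rel[rel], h[j])
            out[node] = [o + norm * m for o, m in zip(out[node], msg)]

    # ReLU non-linearity.
    return [[max(0.0, v) for v in row] for row in out]


# Toy graph with hypothetical relation names and hand-picked weights.
identity = [[1.0, 0.0], [0.0, 1.0]]
negation = [[-1.0, 0.0], [0.0, -1.0]]
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
triples = [(0, "treats", 2), (1, "causes", 2)]
result = rgcn_layer(h, triples, {"treats": identity, "causes": negation}, identity)
print(result)
```

In a real model the weight matrices are learned and the layer is usually written with a tensor library; the loop form here is only meant to make the per-relation aggregation easy to follow.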

Best,

cuilimeng commented 3 years ago

Hi @arxrean and @isspek,

The files entity2id.csv, output.csv and token2id.csv are from the medical knowledge graph KnowLife. The terms of use of this dataset prohibit us from distributing it to other parties, even for noncommercial or educational use. That is why the dataset is incomplete. Sorry for any inconvenience.

Limeng

arxrean commented 3 years ago

@cuilimeng No problem! Thank you again for sharing the code.

isspek commented 3 years ago

@arxrean, thanks for your suggestion, I will check that code too. @cuilimeng thanks for the clarification and for the nice work!!

mtoles commented 2 years ago

Does anyone know how to get access to KnowLife? The link has no useful information or contact details.