Ebiquity / CASIE

CyberAttack Sensing and Information Extraction

Can't find content.nostop.label and .content.json #8

Closed nansunsun closed 3 years ago

nansunsun commented 4 years ago

Does anyone know how to get the content.nostop.label and .content.json or .ner.json files that are used as train and test input? Thanks in advance. Cheers.

josephKhoury95 commented 3 years ago

Any update on this? Any help is highly appreciated. Thank you.

HuangZhenyang commented 3 years ago

I am stuck on this too. I don't know what the data in the file “content.nostop.label” looks like. Any ideas? Thx.

josephKhoury95 commented 3 years ago

I invite you to check the helper code here, which produces Inside, Outside, Beginning (IOB) tags and linguistic annotations for the CASIE corpus using the spaCy library. [The CASIE work is cited in the repository.]
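For readers unfamiliar with IOB tagging, here is a minimal sketch of the idea in Python. It is not the linked helper code: it uses a naive whitespace tokenizer instead of spaCy, and the `(start, end, label)` span layout and the `Organization` label are assumptions made for illustration, not the CASIE annotation schema.

```python
import re


def iob_tags(text, spans):
    """Return (token, tag) pairs for `text`.

    `spans` is a list of (start, end, label) character offsets. This layout
    is an assumption for illustration, not the CASIE annotation schema.
    """
    tagged = []
    for match in re.finditer(r"\S+", text):  # naive whitespace tokenizer
        tag = "O"  # default: token is outside every annotated span
        for start, end, label in spans:
            # A token belongs to a span if it lies fully inside it.
            if match.start() >= start and match.end() <= end:
                prefix = "B" if match.start() == start else "I"
                tag = f"{prefix}-{label}"
                break
        tagged.append((match.group(), tag))
    return tagged


# Hypothetical sentence and span, for illustration only.
text = "Attackers stole customer data from Acme Corp servers"
spans = [(35, 44, "Organization")]
tagged = iob_tags(text, spans)
```

The first token of a span gets a `B-` prefix, later tokens in the same span get `I-`, and everything else is tagged `O`.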

HuangZhenyang commented 3 years ago

@josephKhoury95 OK, I'll check that. Thank you so much, you are so nice. :D

Atleastihaveme commented 3 years ago

Thx! How can I get the content.nostop.label and .content.json files with the [helper code]? And do you know which files are the train and test files for CASIE? Any help is highly appreciated. Thank you!

josephKhoury95 commented 3 years ago

@Atleastihaveme, I gave up on the provided code. I ended up using only the CASIE corpus and annotated it using the helper code. Then I trained a CRF and a bidirectional LSTM with CRF on my own.
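The thread does not say which features or libraries were used for the CRF, so the following is only a sketch of a common setup: each token of an IOB-tagged sentence is turned into a feature dictionary, and a sentence becomes a list of such dictionaries. Libraries such as sklearn-crfsuite accept input in this shape. The feature names and the feature set itself are hypothetical.

```python
def token_features(tokens, i):
    """Feature dict for the token at position i (hypothetical feature set)."""
    word = tokens[i]
    features = {
        "bias": 1.0,
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),
        "word.isdigit": word.isdigit(),
        "suffix3": word[-3:],
    }
    # Context features from the neighbouring tokens, with sentence-boundary flags.
    if i > 0:
        features["prev.lower"] = tokens[i - 1].lower()
    else:
        features["BOS"] = True
    if i < len(tokens) - 1:
        features["next.lower"] = tokens[i + 1].lower()
    else:
        features["EOS"] = True
    return features


def sentence_features(tokens):
    """One feature dict per token, in sentence order."""
    return [token_features(tokens, i) for i in range(len(tokens))]
```

The matching labels for training would be the per-token IOB tags, one list of tag strings per sentence.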

Atleastihaveme commented 3 years ago

Thank you so much, you are so nice! Could you please share your code with me? I'm studying this now and can't find any useful resources. I would appreciate it very much!

josephKhoury95 commented 3 years ago

I followed this article. The author of the article also provides his code on GitHub.

Atleastihaveme commented 3 years ago

Thx, you are really so nice! Thanks again!

staneeya commented 3 years ago

Hi guys,

Sorry for the late reply. I have a lot on my plate right now. The content.json can be produced using Stanford CoreNLP. Below is an example.

4.content.json.zip
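As a rough illustration of the CoreNLP step, the sketch below sends text to a locally running CoreNLP server and reads tokens from its JSON output. The annotator list, the port, and the helper names (`build_url`, `annotate`, `iter_tokens`) are assumptions; the thread does not state which annotators were used, and the output is not guaranteed to match the attached example file.

```python
import json
import urllib.parse
import urllib.request


def build_url(base_url, annotators):
    """Build a CoreNLP server URL that requests JSON output."""
    props = {"annotators": ",".join(annotators), "outputFormat": "json"}
    return base_url + "/?properties=" + urllib.parse.quote(json.dumps(props))


def annotate(text, base_url="http://localhost:9000",
             annotators=("tokenize", "ssplit", "pos", "lemma", "ner")):
    """POST `text` to a running CoreNLP server and return the parsed JSON."""
    request = urllib.request.Request(
        build_url(base_url, annotators), data=text.encode("utf-8")
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def iter_tokens(doc):
    """Yield (word, pos, ner) for every token in a CoreNLP JSON document."""
    for sentence in doc.get("sentences", []):
        for token in sentence.get("tokens", []):
            yield token["word"], token.get("pos"), token.get("ner")
```

With a server started from the CoreNLP directory (for example `java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000`), calling `json.dump(annotate(text), out_file)` for each article would write one JSON file per document.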

Here is another example, this one for content.nostop.label, which can easily be produced by a simple script: 4.content.nostop.label.zip

Hope it helps.

HuangZhenyang commented 3 years ago

@staneeya Thanks a million. Hope everything goes well with your work. :D

Atleastihaveme commented 3 years ago

Thanks a lot! :D