qq547276542 / Agriculture_KnowledgeGraph

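Agricultural Knowledge Graph (AgriKG): information retrieval, named entity recognition, relation extraction, intelligent question answering, and decision support for the agriculture domain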
GNU General Public License v3.0
3.99k stars 1.56k forks

Hello, I want to use my own data for automatic relation extraction, but how do I change the data format? #46

Open grance1 opened 5 years ago

CrisJk commented 5 years ago

Hello, just convert your data to JSON in the following format:

{
  "head": {
    "id": "/guid/9202a8c04000641f800000000094674d",
    "type": "/common/topic,/location/neighborhood,/location/location",
    "word": "Mount Washington"
  },
  "relation": "/location/neighborhood/neighborhood_of",
  "sentence": "And meteorologists profiled an astounding storm : 7.57 inches of rain in Central Park on Sunday made it the city 's second-wettest day since recordkeeping began in 1869 ; other rain records were set in Philadelphia , Newark , Trenton , Baltimore , Washington and Bridgeport , Conn. ; 26 inches of snow hit Tupper Lake , N.Y. , and winds were clocked at 72 miles an hour in Milton , Mass. , 81 m.p.h. in Cape Elizabeth , Me. , and 156 m.p.h. at Mount Washington in New Hampshire .",
  "tail": {
    "id": "/guid/9202a8c04000641f800000000004921a",
    "type": "/government/governmental_jurisdiction,/people/place_of_interment,/location/location,/location/citytown,/common/topic,/user/skud/names/namesake,/sports/sports_team_location,/user/nitromaster101/default_domain/abh_city,/fictional_universe/fictional_setting,/user/skud/names/topic,/business/employer,/location/hud_foreclosure_area,/location/administrative_division,/user/brendan/default_domain/top_architectural_city,/film/film_location,/base/gardening/topic,/base/petbreeds/topic,/location/place_with_neighborhoods,/business/business_location,/architecture/architectural_structure_owner,/visual_art/art_owner,/location/dated_location,/base/petbreeds/city_with_dogs,/location/us_county,/location/statistical_region",
    "word": "Baltimore"
  }
},
{
  "head": {
    "id": "/guid/9202a8c04000641f800000000000d0f6",
    "type": "/government/governmental_jurisdiction,/location/dated_location,/base/localfood/topic,/government/political_district,/book/book_subject,/base/oakland/topic,/organization/organization_scope,/base/localfood/food_producing_region,/base/ontologies/ontology_instance,/location/administrative_division,/base/locations/states_and_provences,/user/tsegaran/random/taxonomy_subject,/user/skud/names/topic,/location/statistical_region,/base/locations/topic,/common/topic,/freebase/apps/hosts/com/acre/juggle/juggle,/base/seafood/topic,/meteorology/cyclone_affected_area,/base/database/topic,/broadcast/genre,/wine/wine_region,/business/business_location,/base/popstra/topic,/wine/appellation,/people/place_of_interment,/fictional_universe/fictional_setting,/location/us_state,/base/seafood/fishery_location,/user/skud/names/name_source,/location/location,/film/film_location,/wine/vineyard,/user/robert/default_domain/states_i_ve_been_to,/base/popstra/sww_base,/base/database/database_financial_supporter,/base/popstra/location,/business/employer",
    "word": "California"
  },
  "relation": "/location/location/contains",
  "sentence": "And Perrianne Simkhovitch , a curly-haired woman scribbling in a journal , said she lived in Humboldt Redwoods State Park in California , had been in town for two weeks , and was dropping in on physics departments around the city to pose questions about '' a time asymmetry where time runs backward . ''",
  "tail": {
    "id": "/guid/9202a8c04000641f80000000002081bb",
    "type": "/protected_sites/protected_site,/common/topic,/location/dated_location,/base/ecology/ecosystem,/location/location",
    "word": "Humboldt Redwoods State Park"
  }
},

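For readers converting their own data, here is a minimal Python sketch that builds records with the same keys as the example above. The helper names `make_record` and `dump_records`, the sample sentence, the relation label, and the ids are all invented for illustration. The sketch also assumes that the file is a single JSON array and that `type` may be left empty; the thread confirms neither.

```python
import json


def make_record(sentence, head_word, tail_word, relation,
                head_id, tail_id, head_type="", tail_type=""):
    """Build one record with the same keys as the example in this thread."""
    # Assumption: both mentions occur verbatim in the sentence, as they do
    # in the two examples; the thread does not state this as a requirement.
    for word in (head_word, tail_word):
        if word not in sentence:
            raise ValueError(f"entity {word!r} not found in sentence")
    return {
        "head": {"id": head_id, "type": head_type, "word": head_word},
        "relation": relation,
        "sentence": sentence,
        "tail": {"id": tail_id, "type": tail_type, "word": tail_word},
    }


def dump_records(records, path):
    """Write the records as one JSON array (assumed file layout)."""
    with open(path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps Chinese text readable in the output file.
        json.dump(records, f, ensure_ascii=False, indent=2)


# Invented sample, not taken from the repository's data.
record = make_record(
    sentence="Rice is widely grown in Hunan .",
    head_word="Hunan",
    tail_word="Rice",
    relation="/agriculture/region/main_crop",
    head_id="ent_000001",
    tail_id="ent_000002",
)
```

Calling `dump_records([record], "my_train.json")` would then write a file in this layout; the file name is only a placeholder.
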
grance1 commented 5 years ago

Hello, a follow-up question: how did you obtain the ids? Or rather, how are they defined?

CrisJk commented 5 years ago

Generate them yourself; just keep them consistent.

grance1 commented 5 years ago

Oh, I built a dictionary from all the words and then took each word's id from it.
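
The dictionary approach described here could look roughly like the sketch below: every distinct entity word gets one id, and repeated words reuse it, which is one way to keep ids consistent. The function name `build_entity_ids` and the `ent_` prefix are hypothetical; the thread only says that ids are self-generated and must be consistent.

```python
def build_entity_ids(words, prefix="ent_"):
    """Map each distinct entity word to a stable, self-generated id."""
    ids = {}
    for word in words:
        # Assign a new id only the first time a word is seen, so the same
        # entity always maps to the same id across all records.
        if word not in ids:
            ids[word] = f"{prefix}{len(ids):06d}"
    return ids


# Invented entity mentions; "Rice" appears twice and keeps a single id.
entity_ids = build_entity_ids(["Rice", "Hunan", "Rice", "Wheat"])
```

The resulting mapping can supply the `id` fields of `head` and `tail` when the records are written out.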

grance1 commented 5 years ago

Thank you.

grance1 commented 5 years ago

Hello, another question: in the model's parameter settings, why is the parameter called max_epoch rather than epoch?

grance1 commented 5 years ago

I have a question: why didn't you use mainstream algorithms such as CNN-CRF for entity recognition?

qq547276542 commented 5 years ago

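@grance1 The main reason is that CNN/LSTM-CRF models need a large amount of data, and we lack labeled data, so we chose an unsupervised method. In addition, we have many entity types, so a sequence labeling approach might not perform well (its only features are the embeddings of the input sequence), whereas our method can bring in the various features of entities found in encyclopedia entries.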