dhimmel / hsdn

Analysis of the human symptoms–disease network
http://git.dhimmel.com/hsdn

What does each field in the disease-symptom-cooccurrence.txt file represent, and which field is the computed symptom-based disease similarity weight? #2

Closed 20182204168 closed 3 years ago

dhimmel commented 3 years ago

I google translated this to:

What does each field in the disease-symptom-cooccurrence.txt file represent? Which field is the calculated symptom-based disease similarity weight?

Are you asking about this portion of the README:

See dhimmel/MEDLINE for our MEDLINE topic extraction and cooccurrence analysis. The symptom--disease relationships are available in disease-symptom-cooccurrence.tsv.

Note that this file is produced by a different repository. See the notebook symptoms.ipynb for how that repository computes disease-symptom cooccurrence. If you still have a question, please open a new issue in that repository. English preferred.

20182204168 commented 3 years ago

Could you please tell me which column in disease-symptom-cooccurrence.txt represents the disease–disease network based on symptom similarity?

dhimmel commented 3 years ago

Here's the head of disease-symptom-cooccurrence.tsv:

| doid_code | doid_name | mesh_id | mesh_name | cooccurrence | expected | enrichment | odds_ratio | p_fisher |
|---|---|---|---|---|---|---|---|---|
| DOID:10652 | Alzheimer's disease | D004314 | Down Syndrome | 800 | 35.61960058033457 | 22.459544379104482 | 39.91835215738615 | 0.0 |
| DOID:10652 | Alzheimer's disease | D008569 | Memory Disorders | 1593 | 76.58053241300478 | 20.80163129982992 | 41.88587656879804 | 0.0 |
| DOID:10652 | Alzheimer's disease | D011595 | Psychomotor Agitation | 334 | 15.235664746873008 | 21.922246619961278 | 35.277329065088274 | 0.0 |
| DOID:10652 | Alzheimer's disease | D000647 | Amnesia | 307 | 14.061215405244994 | 21.83310554260377 | 34.89009878832037 | 4.277452035e-314 |
| DOID:10652 | Alzheimer's disease | D006816 | Huntington Disease | 255 | 12.130613747774285 | 21.021195242226486 | 32.63003507014028 | 8.215868081643624e-256 |
| DOID:10652 | Alzheimer's disease | D000849 | Anomia | 115 | 3.330287859136972 | 34.53154948287314 | 77.95753484320558 | 1.3820847245462235e-147 |
| DOID:10652 | Alzheimer's disease | D007806 | Language Disorders | 198 | 18.662482688883514 | 10.609520892841369 | 12.99293205444876 | 4.33028234641986e-135 |
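
For readers trying to interpret the columns: in the rows shown above, `enrichment` appears to equal `cooccurrence` divided by `expected`. The sketch below checks that relationship on the first three rows. It is only a sanity check on the displayed values, not the code that produced the file; the helper name `check_enrichment` and the tolerance are assumptions made for illustration, and symptoms.ipynb in dhimmel/MEDLINE remains the authoritative description of every column.

```python
import csv
import io
import math

# First three data rows from the head shown above, written as tab-separated text.
SAMPLE = (
    "doid_code\tdoid_name\tmesh_id\tmesh_name\tcooccurrence\texpected\t"
    "enrichment\todds_ratio\tp_fisher\n"
    "DOID:10652\tAlzheimer's disease\tD004314\tDown Syndrome\t800\t"
    "35.61960058033457\t22.459544379104482\t39.91835215738615\t0.0\n"
    "DOID:10652\tAlzheimer's disease\tD008569\tMemory Disorders\t1593\t"
    "76.58053241300478\t20.80163129982992\t41.88587656879804\t0.0\n"
    "DOID:10652\tAlzheimer's disease\tD011595\tPsychomotor Agitation\t334\t"
    "15.235664746873008\t21.922246619961278\t35.277329065088274\t0.0\n"
)


def check_enrichment(text, rel_tol=1e-6):
    """Return (mesh_name, matches) pairs, where matches is True when
    cooccurrence / expected agrees with the enrichment column."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    results = []
    for row in reader:
        ratio = int(row["cooccurrence"]) / float(row["expected"])
        matches = math.isclose(ratio, float(row["enrichment"]), rel_tol=rel_tol)
        results.append((row["mesh_name"], matches))
    return results


results = check_enrichment(SAMPLE)
```

Each row describes one disease–symptom pair, so none of these columns is a disease–disease weight.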

This file contains disease-symptom similarity. It does not contain disease–disease similarity based on symptoms. Neither the dhimmel/hsdn nor the dhimmel/MEDLINE repository computes disease–disease similarity using symptoms. dhimmel/MEDLINE does compute a disease-disease-cooccurrence.tsv file based on literature cooccurrence of the diseases (not symptom similarity).
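
Since neither repository provides it, someone who wants a symptom-based disease–disease score would have to derive it from the disease–symptom rows themselves. The sketch below shows one common option, cosine similarity between per-disease symptom vectors. This is an illustrative assumption, not a method used by dhimmel/hsdn or dhimmel/MEDLINE. The choice of weight (for example the `enrichment` column) is also an assumption, and the identifiers and numbers in `TOY_ROWS` are made up for the example.

```python
import math
from collections import defaultdict


def disease_vectors(rows):
    """Build {disease: {symptom: weight}} from (disease, symptom, weight) triples."""
    vectors = defaultdict(dict)
    for disease, symptom, weight in rows:
        vectors[disease][symptom] = weight
    return vectors


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    shared = set(u) & set(v)
    dot = sum(u[s] * v[s] for s in shared)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy triples: hypothetical identifiers and weights, NOT taken from the real file.
TOY_ROWS = [
    ("disease_a", "symptom_x", 3.0),
    ("disease_a", "symptom_y", 4.0),
    ("disease_b", "symptom_x", 3.0),
    ("disease_b", "symptom_y", 4.0),
    ("disease_c", "symptom_z", 5.0),
]

vectors = disease_vectors(TOY_ROWS)
sim_ab = cosine_similarity(vectors["disease_a"], vectors["disease_b"])
sim_ac = cosine_similarity(vectors["disease_a"], vectors["disease_c"])
```

With the real file, the triples could be taken from the `doid_code`, `mesh_id`, and a chosen weight column of each row.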

20182204168 commented 3 years ago

Sorry, I cannot download the file named disease-disease-cooccurrence.tsv. Could you please upload the file?

dhimmel commented 3 years ago

I cannot download the file named disease-disease-cooccurrence.tsv. Could you please upload the file?

You need to click the Download button. Here is a direct link to download the file: https://github.com/dhimmel/medline/raw/0c9e2905ccf8aae00af5217255826fe46cba3d30/data/disease-disease-cooccurrence.tsv
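
If the Download button is inconvenient, the direct link can also be fetched from a script. The sketch below uses only the Python standard library; the function names `fetch_text` and `parse_tsv` are hypothetical helpers, and the code makes no assumption about which columns the file contains beyond it being tab-separated with a header row.

```python
import csv
import io
import urllib.request

# Direct link quoted in the comment above.
URL = (
    "https://github.com/dhimmel/medline/raw/"
    "0c9e2905ccf8aae00af5217255826fe46cba3d30/data/disease-disease-cooccurrence.tsv"
)


def fetch_text(url):
    """Download a URL and decode the response body as UTF-8 text."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def parse_tsv(text):
    """Parse tab-separated text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))
```

Calling `parse_tsv(fetch_text(URL))` would return one dict per row, keyed by the column names in the file's header.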