Closed tianranzhang closed 7 years ago
extract_concepts should take a list of strings as input. What happens when you try:
concepts,error = mm.extract_concepts([result_set['criteria']])
Thanks for your reply. I tried to use [result_set['criteria']] and it returns the same error:
TypeError: a bytes-like object is required, not 'str'
Since the original dict objects are stored in a list named 'result_set', I tried to create a new list of strings ('result_string') by fetching the 'criteria' element of each object.
I ran the following code:
result_string=[]
for i in range(0, len(result_set)-1):
    result_string.append(result_set[i]['criteria'])
concepts, error = mm.extract_concepts(results_set)
Again it returns the error:
TypeError: a bytes-like object is required, not 'str'
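Separately from the TypeError, the loop as posted appears to stop one element short (range(0, len(result_set)-1) skips the last index) and the final call passes results_set rather than the newly built result_string. A minimal sketch of building the list of strings, assuming result_set is a list of dicts that each have a 'criteria' key; the helper name collect_criteria and the sample records are hypothetical, for illustration only:

```python
def collect_criteria(records):
    """Return the 'criteria' text of every record as a list of strings."""
    # Iterating over the records directly avoids off-by-one index errors.
    return [record['criteria'] for record in records]


# Hypothetical sample data standing in for rows fetched from the database.
result_set = [
    {'criteria': 'Male physicians, ages 40 to 84.'},
    {'criteria': 'No history of stroke.'},
]

result_string = collect_criteria(result_set)

# With a MetaMap instance `mm` created as shown elsewhere in this thread,
# the list of strings would then be passed like this:
# concepts, error = mm.extract_concepts(result_string)
```

This only fixes how the input list is built; it would not by itself change an error raised inside the wrapper.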
Can you give a complete code example that I can use to try to recreate the error? Also, what OS are you using and what version of MetaMap?
When I try to run the following code here is what I get:
mm = MetaMap.get_instance('/opt/public_mm16/public_mm/bin/metamap16')
result_set={'criteria': 'Male physicians, ages 40 to 84. No history of stroke, myocardial infarction, cancer, or renal disease. No contraindications to aspirin or beta-carotene. No current usage of aspirin or Vitamin A tables greater than once per week.'}
concepts,error = mm.extract_concepts([result_set['criteria']])
Processing 00000000.tx.1: 'Male physicians, ages 40 to 84. No history of stroke, myocardial infarction, cancer, or renal disease. No contraindications to aspirin or beta-carotene. No current usage of aspirin or Vitamin A tables greater than once per week.'
for concept in concepts:
    print concept
ConceptMMI(index='00000000', mm='MMI', score='30.42', preferred_name='Beta Carotene', cui='C0053396', semtypes='[orch,phsu,vita]', trigger='[".BETA.-CAROTENE"-tx-1-"beta-carotene"-noun-0]', location='TX', pos_info='139/13', tree_codes='D02.455.326.271.665.202.123;D02.455.426.392.368.367.379.249.050;D02.455.849.131.123;D23.767.261.050')
ConceptMMI(index='00000000', mm='MMI', score='28.77', preferred_name='Vitamin A', cui='C0042839', semtypes='[orch,phsu,vita]', trigger='["VITAMIN A"-tx-1-"Vitamin A"-noun-0]', location='TX', pos_info='185/9', tree_codes='D02.455.326.271.665.202.495.818;D02.455.426.392.368.367.379.249.700.860;D02.455.849.131.495.818;D23.767.261.700.860;x.x.x.x')
ConceptMMI(index='00000000', mm='MMI', score='26.00', preferred_name='N-acetyl-S-(alpha-methyl-4-(2-methylpropyl)benzeneacetyl)cysteine 4-(nitrooxy)butyl ester', cui='C1454756', semtypes='[orch]', trigger='["NO-aspirin"-tx-1-"No aspirin"-noun-0]', location='TX', pos_info='[104/2,128/7],[154/2,174/7]', tree_codes='x.x.x.x')
ConceptMMI(index='00000000', mm='MMI', score='16.05', preferred_name='Cerebrovascular accident', cui='C0038454', semtypes='[dsyn]', trigger='["STROKE"-tx-1-"stroke"-noun-1]', location='TX', pos_info='47/6', tree_codes='C10.228.140.300.775;C14.907.253.855')
ConceptMMI(index='00000000', mm='MMI', score='14.64', preferred_name='Glycosylation End Products, Advanced', cui='C0162574', semtypes='[bacs,orch]', trigger='["AGEs"-tx-1-"ages"-verb-0]', location='TX', pos_info='18/4', tree_codes='D12.776.643.500')
ConceptMMI(index='00000000', mm='MMI', score='14.64', preferred_name='Kidney Diseases', cui='C0022658', semtypes='[dsyn]', trigger='["RENAL DISEASE, NOS"-tx-1-"renal disease"-noun-1]', location='TX', pos_info='89/13', tree_codes='C12.777.419;C13.351.968.419')
ConceptMMI(index='00000000', mm='MMI', score='14.64', preferred_name='Myocardial Infarction', cui='C0027051', semtypes='[dsyn]', trigger='["Infarction, Myocardial"-tx-1-"myocardial infarction"-noun-1]', location='TX', pos_info='55/21', tree_codes='C14.280.647.500;C14.907.585.500')
ConceptMMI(index='00000000', mm='MMI', score='13.14', preferred_name='Physicians', cui='C0031831', semtypes='[prog]', trigger='["Physicians"-tx-1-"physicians"-noun-0]', location='TX', pos_info='6/10', tree_codes='M01.526.485.810;N02.360.810')
ConceptMMI(index='00000000', mm='MMI', score='9.88', preferred_name='contraindications aspect', cui='C0079164', semtypes='[qlco]', trigger='["contraindications"-tx-1-"contraindications"-noun-0]', location='TX', pos_info='107/17', tree_codes='x.x.x')
ConceptMMI(index='00000000', mm='MMI', score='9.81', preferred_name='Males', cui='C0086582', semtypes='[orga]', trigger='["MALE"-tx-1-"Male"-noun-0]', location='TX', pos_info='1/4', tree_codes='x.x.x')
ConceptMMI(index='00000000', mm='MMI', score='6.79', preferred_name='Data Table', cui='C1706074', semtypes='[inpr]', trigger='["Tables"-tx-1-"tables"-noun-0]', location='TX', pos_info='195/6', tree_codes='V02.930')
ConceptMMI(index='00000000', mm='MMI', score='5.18', preferred_name='Age', cui='C0001779', semtypes='[orga]', trigger='["AGE"-tx-1-"ages"-verb-0]', location='TX', pos_info='18/4', tree_codes='')
ConceptMMI(index='00000000', mm='MMI', score='5.18', preferred_name='Beta carotene measurement', cui='C0696105', semtypes='[lbpr]', trigger='["Beta Carotene"-tx-1-"beta-carotene"-noun-0]', location='TX', pos_info='139/13', tree_codes='')
ConceptMMI(index='00000000', mm='MMI', score='5.18', preferred_name='Cancer Genus', cui='C0998265', semtypes='[euka]', trigger='["Cancer"-tx-1-"cancer"-noun-0]', location='TX', pos_info='78/6', tree_codes='')
ConceptMMI(index='00000000', mm='MMI', score='5.18', preferred_name='Electrocardiogram: myocardial infarction (finding)', cui='C0428953', semtypes='[fndg]', trigger='["MYOCARDIAL INFARCTION"-tx-1-"myocardial infarction"-noun-1]', location='TX', pos_info='55/21', tree_codes='')
ConceptMMI(index='00000000', mm='MMI', score='5.18', preferred_name='Greater Than', cui='C0439093', semtypes='[qnco]', trigger='["Greater Than"-tx-1-"greater than"-adj-0]', location='TX', pos_info='202/12', tree_codes='')
ConceptMMI(index='00000000', mm='MMI', score='5.18', preferred_name='Malignant Neoplasms', cui='C0006826', semtypes='[neop]', trigger='["CANCER"-tx-1-"cancer"-noun-1]', location='TX', pos_info='78/6', tree_codes='')
ConceptMMI(index='00000000', mm='MMI', score='5.18', preferred_name='Myocardial Infarction ECG Assessment', cui='C3810814', semtypes='[diap]', trigger='["Myocardial Infarction"-tx-1-"myocardial infarction"-noun-0]', location='TX', pos_info='55/21', tree_codes='')
ConceptMMI(index='00000000', mm='MMI', score='5.18', preferred_name='Myocardial infarction:Finding:Point in time:^Patient:Ordinal', cui='C2926063', semtypes='[clna]', trigger='["Myocardial infarction"-tx-1-"myocardial infarction"-noun-0]', location='TX', pos_info='55/21', tree_codes='')
ConceptMMI(index='00000000', mm='MMI', score='5.18', preferred_name='Once a week', cui='C0558293', semtypes='[tmco]', trigger='["Once per week"-tx-1-"once per week"-noun-0]', location='TX', pos_info='215/13', tree_codes='')
ConceptMMI(index='00000000', mm='MMI', score='5.18', preferred_name='Primary malignant neoplasm', cui='C1306459', semtypes='[neop]', trigger='["Cancer"-tx-1-"cancer"-noun-1]', location='TX', pos_info='78/6', tree_codes='')
ConceptMMI(index='00000000', mm='MMI', score='3.77', preferred_name='No history of', cui='C0332122', semtypes='[qlco]', trigger='["No history of"-tx-1-"No history of"-noun-0]', location='TX', pos_info='33/13', tree_codes='')
ConceptMMI(index='00000000', mm='MMI', score='3.63', preferred_name='Table - furniture', cui='C0039224', semtypes='[mnob]', trigger='["tables"-tx-1-"tables"-noun-0]', location='TX', pos_info='195/6', tree_codes='')
ConceptMMI(index='00000000', mm='MMI', score='3.59', preferred_name='/40', cui='C0439509', semtypes='[tmco]', trigger='["/40"-tx-1-"40"-integer-0]', location='TX', pos_info='23/2', tree_codes='')
ConceptMMI(index='00000000', mm='MMI', score='3.59', preferred_name='40%', cui='C3842587', semtypes='[qnco]', trigger='["40%"-tx-1-"40"-integer-0]', location='TX', pos_info='23/2', tree_codes='')
ConceptMMI(index='00000000', mm='MMI', score='3.56', preferred_name='Usage', cui='C0457083', semtypes='[ftcn]', trigger='["Usage"-tx-1-"usage"-noun-0]', location='TX', pos_info='165/5', tree_codes='')
ConceptMMI(index='00000000', mm='MMI', score='3.53', preferred_name='Vitamin A Drug Class', cui='C3714656', semtypes='[phsu,vita]', trigger='["VITAMIN A"-tx-1-"Vitamin A"-noun-0]', location='TX', pos_info='185/9', tree_codes='')
ConceptMMI(index='00000000', mm='MMI', score='3.53', preferred_name='Vitamin A [EPC]', cui='C2825076', semtypes='[vita]', trigger='["Vitamin A"-tx-1-"Vitamin A"-noun-0]', location='TX', pos_info='185/9', tree_codes='')
ConceptMMI(index='00000000', mm='MMI', score='3.50', preferred_name='Male Gender, Self Report', cui='C1706180', semtypes='[qlco]', trigger='["Male"-tx-1-"Male"-noun-0]', location='TX', pos_info='1/4', tree_codes='')
ConceptMMI(index='00000000', mm='MMI', score='3.50', preferred_name='Male Phenotype', cui='C1706428', semtypes='[qlco]', trigger='["Male"-tx-1-"Male"-noun-0]', location='TX', pos_info='1/4', tree_codes='')
ConceptMMI(index='00000000', mm='MMI', score='3.50', preferred_name='Male, Self-Reported', cui='C1706429', semtypes='[orga]', trigger='["Male"-tx-1-"Male"-noun-0]', location='TX', pos_info='1/4', tree_codes='')
ConceptMMI(index='00000000', mm='MMI', score='3.43', preferred_name='Current (present time)', cui='C0521116', semtypes='[tmco]', trigger='["CURRENT"-tx-1-"current"-adj-0]', location='TX', pos_info='157/7', tree_codes='')
ConceptMMI(index='00000000', mm='MMI', score='3.43', preferred_name='Electrical Current', cui='C1705970', semtypes='[npop]', trigger='["Current"-tx-1-"current"-adj-0]', location='TX', pos_info='157/7', tree_codes='')
Sorry for not providing enough information. I am using Mac OS X 10.9.5 and metamap16. I opened a new file and ran the following code after importing the necessary packages:
mm = MetaMap.get_instance('/Users/zhangtianran/Downloads/public_mm2/bin/metamap16')
result_set={'criteria': 'Male physicians, ages 40 to 84. No history of stroke, myocardial infarction, cancer, or renal disease. No contraindications to aspirin or beta-carotene. No current usage of aspirin or Vitamin A tables greater than once per week.'}
concepts,error = mm.extract_concepts([result_set['criteria']])
And it's still returning the same error. I am starting to suspect that I am not setting up the MetaMap server correctly. Do I need to make sure that the MetaMap server is running before I use this package?
Thanks.
I have only run this on Linux. Let me try setting up MetaMap on my Mac and see if I can reproduce the error.
Thank you so much!!
Tianranzhang, are you trying to use this code on Python 3.x? If you are not using Python 2.7, I think that may be the problem. I have only tested with Python 2.7.
-edit- I just tested on my Mac and everything seems to work. Make sure you have run ./bin/skrmedpostctl start and ./bin/wsdserverctl start
If you are on a Python 3.x version, I think the issue is using Python 3.x instead of 2.7.
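For background on why the Python version matters here: in Python 3, text (str) and binary data (bytes) are distinct types, and writing a str to a binary stream raises a TypeError, whereas Python 2 code that mixes the two often runs without complaint. The sketch below only illustrates that general behaviour with an in-memory binary buffer; it is not taken from the wrapper's source, and the helper name try_binary_write is hypothetical.

```python
import io


def try_binary_write(payload):
    """Write payload to a binary buffer; return the error text, or None on success."""
    buffer = io.BytesIO()
    try:
        buffer.write(payload)
    except TypeError as exc:
        # Python 3 refuses to write text (str) to a binary stream.
        return str(exc)
    return None


text = 'No history of stroke.'

print(try_binary_write(text))                  # str: rejected under Python 3
print(try_binary_write(text.encode('utf-8')))  # bytes: accepted
```

If the failing write happens inside a library, encoding the argument before the call may not help, because the library may build its own str values internally; that would be consistent with the error persisting after str.encode was tried in this thread.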
Were you able to get this running?
Yes, it worked under Python 2.7. I am still trying to get it working under Python 3.5...
Sure. I hope it works for you under 2.7 for your current needs. I will need to update the code to work with python 3.5 when I have a chance.
Thank you! I will look into potential solutions as well.
Thanks for developing this wrapper!
This should now work under Python 3.5.
Hi, I just discovered the Pymetamap package today and I am new to Python. I am using this package to analyze clinical trial inclusion criteria retrieved from a MySQL database as dict objects. This is the object I used for a test run:
result_set={'criteria': 'Male physicians, ages 40 to 84. No history of stroke, myocardial infarction, cancer, or renal disease. No contraindications to aspirin or beta-carotene. No current usage of aspirin or Vitamin A tables greater than once per week.'}
I first converted the dict object to a string:
str_json = json.dumps(result_set)
When I followed the example usage code and tried to run the line
concepts,error = mm.extract_concepts(str_json)
it returns the error:
TypeError: a bytes-like object is required, not 'str'
Then I tried to convert it to bytes format by running:
data=str.encode(str_json)
And checked the type of the newly generated object:
type(data)
It shows that data is of type 'bytes' already.
Thus I ran the concept extraction code again:
concepts,error = mm.extract_concepts(data)
And it still returns the same error asking for a 'bytes-like object'.
Could you please help me figure out what is wrong here? Is there anything I should look into other than the data type conversion (since I already converted the data type)?
I am currently using Python 3 (Anaconda environment).
Thank you so much!!
Tianran
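To make the types in the steps above concrete, here is a small sketch, assuming Python 3 and a shortened version of the example record. json.dumps yields one str for the whole dict and encoding it yields one bytes object, while the reply in this thread says extract_concepts expects a list of strings. The helper name describe is hypothetical.

```python
import json


def describe(value):
    """Return the type name of value, e.g. 'str', 'bytes' or 'list'."""
    return type(value).__name__


# Shortened stand-in for the record quoted in the post.
result_set = {'criteria': 'Male physicians, ages 40 to 84.'}

str_json = json.dumps(result_set)      # one str containing the whole JSON document
data = str_json.encode('utf-8')        # the same document as one bytes object
sentences = [result_set['criteria']]   # a list holding the criteria text itself

for name, value in [('str_json', str_json), ('data', data), ('sentences', sentences)]:
    print(name, describe(value))
```

Only the last form is a list of plain strings; the JSON versions would also carry the braces, key name and quotes along with the criteria text.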