@ycq091044
Hi Yang:
When I tried to preprocess the MIMIC-III data with your preprocessing code, I hit an error at line 8 of ddi_mask_H.py saying that idx2drug.pkl was not found. Is idx2drug.pkl generated by get_SMILES.py? When I tried to run get_SMILES.py, an error occurred at line 10 with the message "'DataFrame' object has no attribute 'ATC4'". It seems that 'ndc2atc' has the attribute 'ATC5' instead of 'ATC4'. Also, at line 14 of get_SMILES.py, it seems that atc2ndc doesn't have the attribute 'NDC'.
Is there something wrong with the data processing files, or am I following the data processing procedure incorrectly? Thanks in advance!
I recently renamed the file id2drug.pkl -> id2SMILES.pkl and forgot to update the name in ddi_mask_H.py.
I have just updated ddi_mask_H.py. Please try it again.
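If a renamed file trips you up again, a small loader like the one below can make the mismatch obvious. This is only a sketch, not code from the repo: `load_pickle` is a hypothetical helper, and the demo writes a made-up one-entry mapping to a temporary directory, so it does not reflect the real contents of id2SMILES.pkl.

```python
import pickle
import tempfile
from pathlib import Path


def load_pickle(path):
    """Load a pickle file, or report which .pkl files exist next to it."""
    path = Path(path)
    if not path.is_file():
        # List sibling .pkl files so a renamed file is easy to spot.
        nearby = sorted(p.name for p in path.parent.glob("*.pkl"))
        raise FileNotFoundError(
            f"{path.name} not found; .pkl files in {path.parent}: {nearby}"
        )
    with path.open("rb") as f:
        return pickle.load(f)


# Demo in a temporary directory with placeholder data (not the real mapping).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "id2SMILES.pkl"
    with demo_path.open("wb") as f:
        pickle.dump({0: "CCO"}, f)

    loaded = load_pickle(demo_path)

    try:
        # Asking for the old name fails, and the message names the new file.
        load_pickle(Path(tmp) / "id2drug.pkl")
    except FileNotFoundError as err:
        missing_message = str(err)

print(loaded)
print(missing_message)
```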
For the second question:
please do not use the get_SMILES.py script for now. As mentioned in the README, this script queries the DrugBank website and extracts the molecule SMILES strings. However, the HTML structure of Drugbank.com changed recently, so the script needs to be updated as well.
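Regarding the 'ATC4' / 'ATC5' attribute error: if you want to check which column names your local mapping table actually has, something like the following may help. It is a minimal sketch under assumptions: `pick_column` is a hypothetical helper, and `example_columns` is a stand-in for `list(ndc2atc.columns)`; the real table's columns may differ.

```python
def pick_column(columns, candidates):
    """Return the first candidate name present in columns, else raise KeyError."""
    available = [str(c) for c in columns]
    for name in candidates:
        if name in available:
            return name
    raise KeyError(f"none of {candidates} found; available columns: {available}")


# Stand-in for list(ndc2atc.columns); these names are assumed for illustration.
example_columns = ["NDC", "ATC5"]

# Prefer 'ATC4' if it exists, otherwise fall back to 'ATC5'.
atc_col = pick_column(example_columns, ["ATC4", "ATC5"])
print(atc_col)
```

With a pandas DataFrame you would pass `df.columns` and then index with `df[atc_col]` instead of relying on attribute access such as `df.ATC4`, which fails when the column is named differently.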