kmuenzen closed this 2 years ago
@bayan6060 If you run the following command, the annotated (i.e. PHI-obscured) text files should be written to the directory specified by the -a flag:
python3 ./generate_dataset/main_ucsf_updated.py -x ./data/i2b2_xml/ -o ./data/phi_notes_i2b2.json -n ./data/i2b2_notes/ -a ./data/i2b2_anno/
If there are no PHI tags specified in your input XML files (or the tags are not formatted correctly), the notes in the annotated folder will appear to be unannotated since there is technically no PHI to obscure.
Have you made sure that the tag format of your input XML files is the same as those in the example data/i2b2_xml/ folder in this repository?
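One quick way to check this is to count the elements inside each file's TAGS block before running the script. The sketch below is only an illustration and is not part of this repository: it assumes each XML file contains a TAGS element holding one child element per PHI span, and the function names count_phi_tags and report_empty_files are made up for this example.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def count_phi_tags(xml_text):
    """Return the number of child elements under <TAGS> (0 if it is empty or missing).

    Assumption: PHI annotations are stored as direct children of a <TAGS> element.
    """
    root = ET.fromstring(xml_text)
    tags = root.find(".//TAGS")
    if tags is None:
        return 0
    return len(list(tags))


def report_empty_files(xml_dir):
    """Return the names of .xml files in xml_dir that have no PHI tags."""
    empty = []
    for path in sorted(Path(xml_dir).glob("*.xml")):
        if count_phi_tags(path.read_text(encoding="utf-8")) == 0:
            empty.append(path.name)
    return empty
```

Calling report_empty_files("./data/i2b2_xml/") would list the files whose TAGS block is empty or absent; for any file on that list, the annotated output is expected to look identical to the plain note.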
Closing due to inactivity.
I have some XML files from i2b2 data sets. When I try to convert the XML files into plain-text and annotated text files, I get only plain-text files without any PHI annotated. Moreover, the JSON file inside the data folder does not contain any PHI. My XML files have nothing between <TAGS> and </TAGS>. How can I generate PHI tags inside them? Thanks