ucam-department-of-psychiatry / crate

Create and use de-identified research databases. Preprocess, extract text, anonymise/de-identify, link, apply natural language processing, query for research, manage consent for contact.
GNU General Public License v3.0

Free text #10

fspivack closed 4 years ago

RudolfCardinal commented 4 years ago

Thank you! Have made some minor tweaks (see what you think -- including renaming the command-line parameter to match that used in the code, which I think is a clearer name). There was a variable typo (VARACHAR_MAX...) and I've tweaked the SQL Server lookup dictionary slightly -- my understanding is that e.g. NVARCHAR will never be present without a numeric limit in the 1-8000 range, but will have "MAX" or something for the giant version. Is that wrong? I've added references in the code.
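To make the question about sized versus "MAX" types concrete, here is a minimal Python sketch of the distinction being described. It is not the lookup dictionary from the code: the function name `text_length`, the regular expression, and the set of type names are all hypothetical, and returning -1 for "MAX" is just an assumed convention for this illustration.

```python
import re
from typing import Optional

# Hypothetical pattern: a type name, optionally followed by "(n)" or "(MAX)".
_TYPE_RE = re.compile(r"^\s*(\w+)\s*(?:\(\s*(\d+|MAX)\s*\))?\s*$", re.IGNORECASE)

# Assumed for this sketch: string types that take a length argument,
# and the subset of those that also accept MAX.
_SIZED_TEXT_TYPES = {"CHAR", "VARCHAR", "NCHAR", "NVARCHAR"}
_MAX_CAPABLE_TYPES = {"VARCHAR", "NVARCHAR"}


def text_length(coltype: str) -> Optional[int]:
    """
    Return the declared length of a sized text type, -1 for the MAX
    variant, or None if the string is not a sized text type or gives
    no explicit length.
    """
    m = _TYPE_RE.match(coltype)
    if not m:
        return None
    base = m.group(1).upper()
    size = m.group(2)
    if base not in _SIZED_TEXT_TYPES or size is None:
        return None
    if size.upper() == "MAX":
        # Only the variable-length types are treated as MAX-capable here.
        return -1 if base in _MAX_CAPABLE_TYPES else None
    return int(size)


if __name__ == "__main__":
    for t in ("NVARCHAR(MAX)", "varchar(50)", "INT", "NVARCHAR"):
        print(t, "->", text_length(t))
```

A caller could then treat a result of -1 as the "giant" free-text case and any positive integer as an ordinary bounded column.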

The option "ddgen_table_defines_pids" is a good idea! Needs documenting in anon_config.rst; would you mind doing that and merging to master, please?