Closed ghost closed 3 years ago
Final controlled vocabulary is available here.
To load platform data from the user-supplied spreadsheet requires some preprocessing to enable proper mapping and importing of the data into the database. The platform reference table (platform_ref) has five columns:
NOTE: The alt_id is only used to migrate existing data from the R4 database to R5. Once data migration is complete, this column can be dropped.
Given the above, the original spreadsheet needs to be modified accordingly before the data can be exported to a tab delimited file (tab is preferred as the field delimiter as some platform names contain a comma). Once the data file has been prepared it can be imported into the database using the mysqlimport program. Note that the import program can be configured (in my.cnf or my.ini) to accept tabs as field terminators and a CRLF as the line terminator (for Windows-generated files only). These options can also be specified as options on the command line, e.g.,
$mysqlimport --delete --fields-terminated-by='\t' --lines-terminated-by='\r\n' --ignore-lines=1 counter5 sql/platform_ref.tsv
This command will load reference data from the specified TSV file into the platform_ref table, emptying the table before it does so. It assumes the user and password options are also specified in my.cnf/my.ini; otherwise, include -u and -p options in the command line.
There are variations in platform names used by different aggregators. For example, one supplier will use ACM as the platform name while another will have it as ACM Digital Library. A controlled vocabulary will provide consistency in platform name usage, which should make it easier and improve overall clarity for end users.
The controlled vocabulary will be provided by the product team.