rafguns / wosfile

Handle Clarivate Analytics Web of Science™ export files

KeyError: 'Z8' #20

Open TimRepke opened 2 months ago

TimRepke commented 2 months ago

It seems like WoS recently changed their export format:

    if has_item_per_line[heading]:  # Iterable field with one item per line
       ~~~~~~~~~~~~~~~~~^^^^^^^^^
    KeyError: 'Z8'
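
The traceback points at a plain dictionary lookup keyed by the two-letter tag, which raises as soon as an export contains a tag missing from the table. A minimal sketch of that failure mode next to a tolerant fallback follows. The table contents and the `item_per_line` helper are hypothetical and only borrow the name `has_item_per_line` from the traceback; this is not wosfile's actual code.

```python
# Hypothetical stand-in for the lookup table seen in the traceback.
# The real table is built from tags.py; these two entries are illustrative.
has_item_per_line = {"AU": True, "TI": False}


def item_per_line(heading: str, default: bool = False) -> bool:
    """Return the 'one item per line' flag, falling back for unknown tags."""
    return has_item_per_line.get(heading, default)


try:
    has_item_per_line["Z8"]  # strict lookup, as in the traceback
except KeyError as exc:
    print(f"strict lookup failed: {exc!r}")

# The tolerant lookup returns the default instead of raising.
print(item_per_line("Z8"))
```

Whether silently defaulting is preferable to failing loudly is a design choice: a default keeps parsing going, but it can hide fields that really are multi-valued.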

Based on the table at the bottom of https://webofscience.help.clarivate.com/en-us/Content/export-records.htm (already mentioned in #14), I automatically added all missing entries to tags.py:

    ("A2", "Other Abstract", False, False),
    ("AA", "Additional Authors", False, False),
    ("AD", "Application Details and Date", False, False),
    ("AE", "Patent Assignee", False, False),
    ("AK", "Abstract (Korean)", False, False),
    ("AN", "Accession Number or PubMedID", False, False),
    ("AW", "Item URL", False, False),
    ("BD", "Broad Descriptors or Broad Terms", False, False),
    ("C2", "Address (non-English)", False, False),
    ("CC", "Concept Codes or CABI Codes", False, False),
    ("CE", "Edition", False, False),
    ("CH", "Chemicals & Biochemicals", False, False),
    ("CI", "Derwent Compound Number", False, False),
    ("CN", "CAS Registry Numbers; Commercial Names; Chemical", False, False),
    ("CO", "CODEN", False, False),
    ("CP", "Cited Patent(s)", False, False),
    ("DC", "Derwent Class Code(s)", False, False),
    ("DF", "Date Filed or Submitted", False, False),
    ("DL", "DOI Link", False, False),
    ("DM", "Demography", False, False),
    ("DN", "DCR Number", False, False),
    ("DP", "Discipline; Diseases", False, False),
    ("DS", "Designated States", False, False),
    ("DY", "Data Type", False, False),
    ("EC", "Category", False, False),
    #("EF", "End of File", False, False),
    #("ER", "End of Record", False, False),
    ("FD", "Further Application Details", False, False),
    ("FN", "File Name", False, False),
    ("FP", "Funding Name Preferred", False, False),
    ("FS", "Field of Search", False, False),
    ("FT", "Foreign Title", False, False),
    ("GE", "Geographic Data", False, False),
    ("GI", "Grant Information", False, False),
    ("GN", "Gene Name", False, False),
    ("GS", "Geospatial", False, False),
    ("GT", "Time", False, False),
    ("IO", "Issuing Organization", False, False),
    ("IP", "International Patent Classification", False, False),
    ("IV", "Investigators", False, False),
    ("JC", "NLM Unique ID", False, False),
    ("LS", "Language of Summary", False, False),
    ("LT", "Literature Type", False, False),
    ("MC", "Major Concepts or Derwent Manual Code(s)", False, False),
    ("ME", "Medium", False, False),
    ("MH", "MeSH Terms", False, False),
    ("MI", "Miscellaneous Descriptors", False, False),
    ("MN", "Markush Number", False, False),
    ("MQ", "Methods & Equipment", False, False),
    ("NM", "Personal Name Subject", False, False),
    ("NO", "Comments, Corrections, Erratum", False, False),
    ("NP", "Named Person", False, False),
    ("NT", "Notes", False, False),
    ("OB", "Record Owner", False, False),
    ("OC", "Country of Original Patent Application Number", False, False),
    ("OD", "Method", False, False),
    ("OP", "Original Patent Application Number", False, False),
    ("OR", "Organism Descriptors; Systematics", False, False),
    ("OS", "Original Source", False, False),
    ("P1", "Part Number", False, False),
    ("PC", "Country of Patent", False, False),
    ("PE", "Published Electronically", False, False),
    ("PR", "Parts, Structures & Systems; Price", False, False),
    ("PS", "Pages", False, False),
    ("PV", "Place of Publication", False, False),
    ("RC", "Date Created, Date Completed, Date Revised", False, False),
    ("RG", "Derwent Registry Number", False, False),
    ("S1", "Source Title (non-English)", False, False),
    ("SA", "Status", False, False),
    ("SD", "Molecular Sequence Data", False, False),
    ("SF", "Space Flight Mission", False, False),
    ("SS", "FSTA Section/Subsection; Citation Subset", False, False),
    ("ST", "Super Taxa", False, False),
    ("TA", "Taxonomic Data", False, False),
    ("TF", "Technology Focus Abstract", False, False),
    ("TL", "Country of Translation", False, False),
    ("TM", "Geologic Time Data", False, False),
    ("TN", "Taxa Notes", False, False),
    ("TR", "Translators", False, False),
    ("TS", "Translated Source", False, False),
    ("UC", "Document Selection URL", False, False),
    ("UR", "URL", False, False),
    ("VN", "Version", False, False),
    ("VR", "Version Number", False, False),
    ("WP", "Publisher Web Address", False, False),
    ("X1", "Article Title (non-English)", False, False),
    ("X2", "Article Title (Transliterated)", False, False),
    ("X4", "Spanish Abstract", False, False),
    ("X5", "Spanish Author Keywords", False, False),
    ("Y1", "Portuguese Document Title", False, False),
    ("Y4", "Portuguese Abstract", False, False),
    ("Y5", "Author Keywords (non-English); Portuguese Author Keywords", False, False),
    ("Z1", "Article Title (Other Languages)", False, False),
    ("Z2", "Authors (non-English)", False, False),
    ("Z3", "Publication Name (Chinese)", False, False),
    ("Z4", "Abstract (non-English)", False, False),
    ("Z5", "Author Keywords (non-English)", False, False),
    ("Z6", "Author Address (non-English)", False, False),
    ("Z7", "E-mail Address (non-English)", False, False),
    ("Z8", "CSCD Times Cited Count", False, False),
    ("ZK", "Author Keywords (Korean)", False, False),

Please note that the "splittable" and "one item per line" settings are not verified. I just thought I'd share this outside a PR in case anyone wants to pick this up and properly incorporate it.
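
Since the entries were appended automatically, it may be worth checking for tags that end up defined twice before building lookup tables from them. A rough sketch, assuming the four fields are ordered as (tag, description, splittable, one item per line); that order and the variable names are assumptions, not confirmed against tags.py:

```python
from collections import Counter

# Illustrative excerpt in the four-field shape used above.
# Assumed field order: (tag, description, splittable, one item per line).
tags = (
    ("FN", "File Name", False, False),
    ("Z8", "CSCD Times Cited Count", False, False),
    ("ZK", "Author Keywords (Korean)", False, False),
)

# Per-tag lookup tables derived from the tuple list.
is_splittable = {tag: splittable for tag, _, splittable, _ in tags}
has_item_per_line = {tag: per_line for tag, _, _, per_line in tags}

# Tags that appear more than once, e.g. after appending entries in bulk.
counts = Counter(entry[0] for entry in tags)
duplicates = sorted(tag for tag, n in counts.items() if n > 1)

print(has_item_per_line["Z8"])
print(duplicates)
```

In a dict comprehension a later duplicate silently overrides an earlier one, so an unverified `False, False` entry could mask a correct existing definition.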

TimRepke commented 2 months ago

The exports now also contain tags that are not documented at all.

    # Undocumented tags
    ("ZS", "Magic undocumented tag", False, False),
    ("ZB", "Magic undocumented tag", False, False),
    ("ZR", "Magic undocumented tag", False, False),
    ("ZA", "Magic undocumented tag", False, False),
    ("ER", "Magic undocumented tag", False, False),
    ("EF", "Magic undocumented tag", False, False),
    ("G1", "Magic undocumented tag", False, False),
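
One way to find such tags without waiting for a `KeyError` is to scan an export for field tags that are not in the known set. The sketch below assumes the plain-text export layout in which a field starts with a two-character tag at the beginning of a line and continuation lines are indented. The `unknown_tags` helper and the sample lines are made up for illustration.

```python
import re
from typing import Iterable, Set

# Assumption: a field line starts with a two-character tag (an uppercase
# letter followed by an uppercase letter or digit), then a space or end of line.
TAG_RE = re.compile(r"^([A-Z][A-Z0-9])(?: |$)")


def unknown_tags(lines: Iterable[str], known: Set[str]) -> Set[str]:
    """Collect tags that occur in the export but not in the known set."""
    found = set()
    for line in lines:
        match = TAG_RE.match(line.rstrip("\r\n"))
        if match and match.group(1) not in known:
            found.add(match.group(1))
    return found


# Made-up sample lines; the indented line is a continuation of the AU field.
sample = [
    "FN Clarivate Analytics Web of Science",
    "PT J",
    "AU Doe, J",
    "   Roe, R",
    "Z8 0",
    "ZS 0",
    "ER",
    "",
    "EF",
]

print(sorted(unknown_tags(sample, known={"FN", "PT", "AU", "ER", "EF"})))
```

For a real file, the same function could be fed the open file object in place of `sample`, with `known` taken from the first element of each tuple in tags.py.
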

rafguns commented 2 months ago

Thanks for the report! I'll try to figure out what those tags represent in the coming days.