seboettg / citeproc-php

Full-featured CSL 1.0.1 processor for PHP
MIT License

Convert array to string before cleaning up value. (#184) #185

Open hktang opened 9 months ago

hktang commented 9 months ago

This is a quick fix assuming the supplied value is an array of strings. I had to apply this patch for my application to work. Please kindly review. Suggestions much appreciated!

Ref: #184

hktang commented 6 months ago

@seboettg Thanks again for your reply. On a closer look, the problematic item seems to be "Copyright", which is an array containing one (or potentially more) objects, as shown below.

I wonder whether we can look upstream and check how this complex array/object ends up inside htmlspecialchars()? I'm sorry, I am not familiar with the codebase at the moment. If you could point me to where to look, I am happy to help.

For now, I have added a test case and hardcoded a fix specific to the license field, i.e. it just extracts the URL from the license. I don't think this is an ideal fix at all, and your feedback would be appreciated. Maybe we could even ignore the license field, as BibTeX does not enforce it.

"license": [
    {
        "URL": "https://creativecommons.org/licenses/by/4.0/",
        "content-version": "vor",
        "delay-in-days": 0,
        "start": {
            "date-parts": [
                [
                    2023,
                    11,
                    8
                ]
            ],
            "date-time": "2023-11-08T00:00:00Z",
            "timestamp": 1699401600000
        }
    }
],

hktang commented 6 months ago

Perhaps it's worth checking for a license field inside the render() function, as below, and moving the string extraction logic into a new function, renderLicense()? The catch is that license is not defined in the CSL data schema.

elseif ($this->toRenderTypeValue === "license") {
    $renderedText = $this->renderLicense($data->{$this->toRenderTypeValue});
    break;
}
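
To make the idea concrete, here is a minimal sketch of what such a renderLicense() could look like. It is not part of citeproc-php; the function body, and the assumption that the value is a list of objects each carrying a "URL" property (as in the license JSON above), are illustrative only.

```php
<?php
// Hypothetical helper, not part of citeproc-php. Assumes the "license" value
// is a list of objects (stdClass after json_decode) that may each carry a
// "URL" property, as in the DOI response quoted above.
function renderLicense($license): string
{
    // Accept a single object or plain string as well as a list.
    $entries = is_array($license) ? $license : [$license];
    $urls = [];
    foreach ($entries as $entry) {
        if (is_object($entry) && isset($entry->URL) && is_string($entry->URL)) {
            $urls[] = $entry->URL;
        } elseif (is_string($entry)) {
            $urls[] = $entry;
        }
    }
    // Escape only once the value is a plain string.
    return htmlspecialchars(implode(", ", $urls));
}

$item = json_decode(
    '{"license": [{"URL": "https://creativecommons.org/licenses/by/4.0/", "content-version": "vor"}]}'
);
echo renderLicense($item->license), PHP_EOL;
```

Entries without a usable URL are skipped rather than raising an error, so an unexpected license shape would render as an empty string instead of breaking htmlspecialchars().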

After implementing the above, I found that the ISSN field in this example still causes errors, as it is an array containing a string, like ["2071-1050"], which breaks htmlspecialchars() again. Having checked the CSL data schema, it seems ISSN/ISBN must be a string. So, this is getting interesting.
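
One way to tolerate both shapes is to collapse the value to a string before it reaches htmlspecialchars(). The sketch below is only an illustration of that idea: toPlainString() is a hypothetical name, not an existing citeproc-php function, and joining multiple values with ", " is an assumption.

```php
<?php
// Hypothetical helper, not part of citeproc-php: collapse a value that may be
// a string or a list of strings into a single string before escaping.
function toPlainString($value, string $separator = ", "): string
{
    if (is_array($value)) {
        // Keep only scalar members; nested arrays/objects are skipped.
        $parts = array_filter($value, 'is_scalar');
        return implode($separator, array_map('strval', $parts));
    }
    // Non-scalar, non-array input (objects, null) yields an empty string.
    return is_scalar($value) ? (string) $value : "";
}

echo htmlspecialchars(toPlainString(["2071-1050"])), PHP_EOL; // 2071-1050
echo htmlspecialchars(toPlainString("2071-1050")), PHP_EOL;   // 2071-1050
```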

For reference, the JSON is retrieved via DOI content negotiation. I wonder why it returns data in a non-compliant format...

curl -LH "Accept: application/vnd.citationstyles.csl+json, application/rdf+xml" https://doi.org/10.3390/su152215733
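
To see which fields in such a response deviate from the expected string type, a small check like the following might help. This is a sketch under assumptions: findNonStringFields() is a hypothetical helper, the default list of variables (ISSN, ISBN) is taken from the discussion above rather than from a full reading of the schema, and the JSON literal is a trimmed stand-in for the real response with a placeholder title.

```php
<?php
// Hypothetical check, not part of citeproc-php. The default variable list is
// an assumption based on the discussion above (ISSN/ISBN should be strings).
function findNonStringFields(stdClass $item, array $expectedStrings = ["ISSN", "ISBN"]): array
{
    $problems = [];
    foreach ($expectedStrings as $name) {
        if (isset($item->{$name}) && !is_string($item->{$name})) {
            // Record the actual PHP type found for the field.
            $problems[$name] = gettype($item->{$name});
        }
    }
    return $problems;
}

// Trimmed, illustrative stand-in for the response fetched with curl above.
$item = json_decode('{"ISSN": ["2071-1050"], "title": "Example"}');
print_r(findNonStringFields($item));
```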