TIBHannover / BacDiveR

Unofficial R client for the DSMZ's Bacterial Diversity Metadatabase (former contact: @katrinleinweber). The examples at https://api.bacdive.dsmz.de/client_examples seem to be the official alternatives.
https://TIBHannover.GitHub.io/BacDiveR/
MIT License

Remove invalid \n in JSON #43

Closed katrinleinweber closed 6 years ago

katrinleinweber commented 6 years ago

While implementing #31 and switching from rjson to jsonlite, I noticed that some fields contain insufficiently escaped \n characters. This results in lexical error: invalid character inside string.
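The underlying rule is not specific to R: the JSON grammar does not allow a raw line feed inside a string literal, so a strict parser rejects it, while the two-character sequence backslash plus n is accepted and decoded to a newline. The following is a minimal sketch of that behaviour using Python's standard json module, not the BacDiveR or jsonlite code; the field name "medium" and its value are made up for illustration.

```python
import json

# Hypothetical payloads: "medium" is an invented field, not a BacDive one.
# `raw` contains a literal line feed inside the JSON string (invalid JSON).
raw = '{"medium": "line one\nline two"}'
# `escaped` contains a backslash followed by "n" (valid JSON escape).
escaped = '{"medium": "line one\\nline two"}'


def parses(text):
    """Return True if `text` is accepted by a strict JSON parser."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(parses(raw))      # False
print(parses(escaped))  # True
```

The strict behaviour is the default in Python's json module; other parsers may be more lenient, which would explain why the problem only surfaced after switching libraries.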

@ceb15: Please consider ensuring that those are escaped as \\n, either already in BacDive or (I presume) during JSON serialisation.

[Screenshot, 2018-03-20]

I'll parse them away for now.

katrinleinweber commented 6 years ago

Ah, no, sorry! It might not be an issue in the BacDive data, but in the way R expects the escape sequences: https://github.com/jeroen/jsonlite/issues/47. I wonder whether it might be possible to detect an R client, and then deliver \\n instead of \n dynamically?

katrinleinweber commented 6 years ago

Fixed by cae151dd8fc448a9c8621c7df7ea06009d153579.

katrinleinweber commented 6 years ago

Whoa! The above dataset doesn't contain that sample any more:

[Screenshot, 2018-07-31]

katrinleinweber commented 6 years ago

https://github.com/TIBHannover/BacDiveR/blob/cae151dd8fc448a9c8621c7df7ea06009d153579/R/retrieve_data.R#L94-L99

is a bit stupid. The code should purge only the invalidly escaped characters, not all whitespace characters.
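One way to make the clean-up specific is to escape raw control characters only when they occur inside a JSON string literal, and to leave every other character (including ordinary spaces and the whitespace between tokens) untouched. The following is a minimal sketch of that idea in Python, not the code at the commit linked above; the function name `escape_control_chars_in_strings` and the sample payload are hypothetical.

```python
import json

# Short escapes defined by the JSON grammar; other control characters
# fall back to the \uXXXX form.
SHORT_ESCAPES = {"\n": "\\n", "\r": "\\r", "\t": "\\t", "\b": "\\b", "\f": "\\f"}


def escape_control_chars_in_strings(text):
    """Escape raw control characters that sit inside JSON string literals.

    Hypothetical helper: whitespace outside of strings and ordinary
    spaces inside strings are passed through unchanged.
    """
    out = []
    in_string = False   # currently between two unescaped double quotes
    after_backslash = False  # previous character was an escaping backslash
    for ch in text:
        if not in_string:
            if ch == '"':
                in_string = True
            out.append(ch)
        elif after_backslash:
            after_backslash = False
            out.append(ch)
        elif ch == "\\":
            after_backslash = True
            out.append(ch)
        elif ch == '"':
            in_string = False
            out.append(ch)
        elif ord(ch) < 0x20:
            out.append(SHORT_ESCAPES.get(ch, "\\u%04x" % ord(ch)))
        else:
            out.append(ch)
    return "".join(out)


# Invented sample: one raw line feed inside a string, plus pretty-printing
# newlines and an escaped quote that must survive unchanged.
broken = '{\n  "medium": "agar\nplate",\n  "note": "a \\"quoted\\" word"\n}'
fixed = escape_control_chars_in_strings(broken)
print(json.loads(fixed))
```

Because the function tracks whether it is inside a string, spaces within values such as the "note" field above are preserved, which a blanket removal of whitespace would not do.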

katrinleinweber commented 6 years ago

Since this changes the output datasets, it might as well be the reason to bump the semantic version to 1.0. #14 also supports this.

katrinleinweber commented 6 years ago

Tested with retrieve_data("Bacillus"), which downloaded about 1.3k datasets without escaping errors.

katrinleinweber commented 6 years ago

No update to v1.0 after all, since some other kinks should be ironed out first (#68, #84, etc.). v0.5 it is!