Open crew102 opened 6 years ago
Yes, that indeed seems like a bug! Thanks for the heads up, we will let you know when it is corrected!
Thanks for responding, Sarah. The bug comes from quote escaping: we first parse the data into TSV, where fields are enclosed in quotes, and only then import it into MySQL. I suggest we fix that at the DB stage as part of our transformation routine.
On Oct 18, 2017, at 7:48 PM, sarahkelley notifications@github.com wrote:
Yes, that indeed seems like a bug! Thanks for the heads up, we will let you know when it is corrected!
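To illustrate the quote-escaping explanation above, here is a minimal sketch in Python. It is not the actual PatentsView pipeline: the title, the patent number, and the use of Python's `csv` module are assumptions chosen only to reproduce the symptom. It shows that a TSV writer wraps a field containing a quotation mark in quotes and doubles the inner quotes, and that a loader which splits on tabs without undoing that escaping stores the escaped form.

```python
import csv
import io

# Hypothetical title containing quotation marks (not taken from a real patent).
title = 'Method for making "smart" widgets'

# Step 1: write a TSV row. With the default QUOTE_MINIMAL setting, the writer
# wraps the field in quotes and doubles every embedded quotation mark.
buf = io.StringIO()
csv.writer(buf, delimiter="\t", lineterminator="\n").writerow(["1234567", title])
raw_line = buf.getvalue().rstrip("\n")

# Step 2a: a loader that only splits on tabs keeps the escaping in the value.
naive_title = raw_line.split("\t")[1]

# Step 2b: a quote-aware reader restores the original title.
parsed_title = next(csv.reader(io.StringIO(raw_line), delimiter="\t"))[1]

print(naive_title)
print(parsed_title)
```

If the import step is told how fields are enclosed and escaped, as in step 2b, the stored title should match the original.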
Thanks for the suggestion Evgeny, that makes a lot of sense!
Sarah: an even easier solution - use mediumtext or longtext in the DB to store detailed descriptions http://boolean.co.nz/blog/max-length-for-mysql-text-field-types/135/
It looks like there is an issue with how PatentsView handles quotation marks. For example, whenever a quotation mark occurs in the patent's title, PatentsView quotes the entire title and adds extra quotation marks around the actual quoted text. You can see this behavior in patent number 5767337:
The same behavior is seen in the bulk data files.
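Until this is fixed upstream, a consumer of the API or bulk files could undo the escaping on their side. The sketch below is a heuristic workaround, not anything provided by PatentsView: the function name `unescape_title` is hypothetical, and it assumes the affected values are wrapped in one pair of quotes with inner quotes doubled, as described above.

```python
def unescape_title(value: str) -> str:
    """Undo CSV-style quoting on a single field value (heuristic).

    Only values that are wrapped in quotes AND contain doubled quotes are
    changed; everything else is returned untouched.
    """
    wrapped = len(value) >= 2 and value.startswith('"') and value.endswith('"')
    if wrapped and '""' in value:
        # Drop the outer quotes, then collapse each doubled quote to one.
        return value[1:-1].replace('""', '"')
    return value


# Hypothetical example values, not real patent titles.
print(unescape_title('"Method for making ""smart"" widgets"'))
print(unescape_title("Plain title without quotes"))
```

A title that is legitimately quoted in full would be misread by this check, so it is best applied only to fields known to be affected.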