ajkulkarni closed this issue 8 years ago
Similar issue reference: https://github.com/diging/tethne/issues/85
Can you add some details about how to recreate this error? See this guide for reference
Expected: Successful bulk import with the option to save collections.
Actual: A UnicodeEncodeError is thrown, as seen in the attached screenshot.
Last Commit SHA: a2b28a313650807d28212545661b6e7a387d6bea
Cool. So this happens when you click the "Submit" button on the bulk upload form?
Yes! The particular collection I created always throws this error. Should I discuss this with Nischal, since he is working on a similar Unicode encoding error?
Yes, work together on this. Thanks!
@nakapika For starters, you can take a look at this: http://www.joelonsoftware.com/articles/Unicode.html
Great article!
Erick Peirson Postdoctoral Scholar ASU-SFI Center for Biosocial Complexity Arizona State University
Really good article! So does this mean that we should use UTF-8 in all our projects to prevent encoding errors?
@nischalsamji and I have been working on this for a while now. The problem is that the conference paper names contain an apostrophe that is not encoded in UTF-8, and this is breaking the code. We tried encoding all the text to UTF-8, but that didn't work. We will continue working on this today to find a permanent solution.
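For illustration, here is a minimal sketch of the kind of failure described above. It assumes the apostrophe is a typographic one (U+2019) and that the text is encoded with the ASCII codec somewhere along the way. The title string and the `try_encode` helper are hypothetical, and this is Python 3, not the project's actual code.

```python
# Hypothetical conference paper title containing a typographic apostrophe (U+2019).
title = "Proceedings of the Society\u2019s Annual Meeting"


def try_encode(text, codec):
    """Return the encoded bytes, or the exception if the codec cannot represent text."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError as err:
        return err


# ASCII has no code for U+2019, so this attempt yields a UnicodeEncodeError.
ascii_result = try_encode(title, "ascii")

# UTF-8 can represent every Unicode character, so this attempt yields bytes.
utf8_result = try_encode(title, "utf-8")

print(type(ascii_result).__name__)
print(type(utf8_result).__name__)
```

If the real error comes from a different codec or a different character, the same check applies with that codec name and string swapped in.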
@nakapika @nischalsamji Thanks for tackling this. Don't the .rdf files have UTF-8 encoding when they are created?
@erickpeirson When the file contents contain a Unicode character, the parser works fine. If there is a Unicode character in the file name, it throws an error.
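One way to narrow this down is to check which characters in a file name fall outside ASCII. The sketch below is only a diagnostic aid under that assumption: the file name and both helper functions (`non_ascii_chars`, `ascii_safe`) are hypothetical and are not part of the parser.

```python
import unicodedata


def non_ascii_chars(name):
    """List (index, character, Unicode name) for every non-ASCII character in name."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(name)
        if ord(ch) > 127
    ]


def ascii_safe(name):
    """Return an ASCII-only rendering of name, escaping what ASCII cannot represent."""
    return name.encode("ascii", "backslashreplace").decode("ascii")


# Hypothetical file name with a typographic apostrophe (U+2019).
file_name = "Society\u2019s papers.rdf"

for index, char, char_name in non_ascii_chars(file_name):
    print(index, repr(char), char_name)

print(ascii_safe(file_name))
```

The escaped form is meant for log messages and error reports only. It does not rename the file or change how the parser opens it.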