crsh / citr

RStudio Addin to Insert Markdown Citations

I cannot use citr addins for Chinese citations #41

Open HelloRMarkdown opened 5 years ago

HelloRMarkdown commented 5 years ago

I have some citations like the ones below, but I cannot use the citr addin with them because the characters come out corrupted.

@article{Chung201901,
   author = {Chung, C. Y. and Kim, D. and Kim, K. S. and Lee, J. H. and Lee, K.},
   title = {Do Institutional Investors Enhance Accounting Earnings Attributes in the Korean Market?},
   journal = {Emerging Markets Finance and Trade},
   volume = {55},
   number = {1},
   pages = {39-58},
   ISSN = {1540-496X},
   DOI = {10.1080/1540496x.2018.1503081},
   url = {<Go to ISI>://WOS:000446108700004},
   year = {2019},
   type = {Journal Article}
}

@article{LiXL201809,
   author = {李小林 and 叶德珠 and 张子健},
   title = {CEO财务经历能否降低公司权益资本成本?},
   journal = {外国经济与管理},
   volume = {40},
   number = {09},
   pages = {96-111},
   ISSN = {1001-4950},
   year = {2018},
   type = {Journal Article}
}

The result looks like this: [screenshot]

I look forward to your reply. Thanks.

crsh commented 5 years ago

Hi, I can't reproduce your problem. I created a bib file containing the above references and used the citr addin to insert the reference handles into the R Markdown document.


selection-box
document

Is the screenshot you attached from the rendered document or the selection box in the addin?

pmp55 commented 5 years ago

I have the same issue when I export the references from Zotero to a .bib file and click an entry in the selection box in the addin. In addition, a warning message like this is shown:

Warning in strsplit(x, rx) : input string 1 is invalid in this locale

I guess it might be related to the character encoding. My session info is as follows:

R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=Chinese (Simplified)_China.936
[2] LC_CTYPE=Chinese (Simplified)_China.936
[3] LC_MONETARY=Chinese (Simplified)_China.936
[4] LC_NUMERIC=C
[5] LC_TIME=Chinese (Simplified)_China.936

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_3.4.3 backports_1.1.2 magrittr_1.5 rprojroot_1.3-2
[5] htmltools_0.3.6 tools_3.4.3 yaml_2.2.0 Rcpp_0.12.17
[9] stringi_1.2.3 rmarkdown_1.10 knitr_1.20 stringr_1.3.1
[13] digest_0.6.15 evaluate_0.10.1

Thanks in advance for your suggestions and patience!

crsh commented 5 years ago

Thanks for chiming in. I could reproduce the warning. It originates in RefManageR::ReadBib which citr uses to import the bib-file. Hence, it would be a good idea to raise this issue with the developers of that package.

However, I'm still not sure what the issue you are experiencing is exactly. Do you have problems inserting the reference handles into the R Markdown file, is it just that references are not displayed correctly in the selection list, or are the references corrupted in the rendered document?

pmp55 commented 5 years ago

@crsh Thanks for your response. Attached is a screenshot of the insertion problem. The references are fine in the rendered document. Any suggestions for fixing it? image

crsh commented 5 years ago

I see, so it's only the preview in the search bar that doesn't get the encoding right. Thanks for clarifying. Could you try either storing your bib file with UTF-8 encoding (File > Save with Encoding) or changing the default encoding setting of citr to see if that fixes the issue? For example,

options(citr.encoding = "GB2312")
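For the first option, here is a minimal sketch of re-encoding a bib file to UTF-8 from within R, as an alternative to the RStudio menu. The helper name `reencode_bib` and the default source encoding `"GB18030"` are assumptions for illustration. The real encoding of a given file may differ, and the encoding names `iconv()` accepts depend on the platform.

```r
# Hypothetical helper (not part of citr or RefManageR): convert a .bib file
# from an assumed source encoding to UTF-8.
reencode_bib <- function(path_in, path_out, from = "GB18030") {
  # Read the file as raw bytes so R does not reinterpret them in the
  # current locale.
  bytes <- readBin(path_in, what = "raw", n = file.info(path_in)$size)
  text <- rawToChar(bytes)

  # iconv() returns NA when the bytes are not valid in the `from` encoding.
  utf8 <- iconv(text, from = from, to = "UTF-8")
  if (is.na(utf8)) {
    stop("Could not convert '", path_in, "' from ", from, " to UTF-8")
  }

  # Write the converted UTF-8 bytes to the output file.
  writeBin(charToRaw(utf8), path_out)
  invisible(path_out)
}
```

This only makes sense for a file that is known not to be UTF-8 already. Running it on a file that is already UTF-8 would most likely garble it rather than fail.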

pmp55 commented 5 years ago

@crsh I just tried both of the suggestions you mentioned. Unfortunately, the problem is still there. Could you show me your sessionInfo in case it gives me any hints? Thanks!

crsh commented 5 years ago

Hm, could you try changing the locale of a new R session before calling the addin?

Sys.setlocale(category="LC_ALL", locale = "en_US.UTF-8")

pmp55 commented 5 years ago

@crsh When I change the locale of a new R session as you recommended, a warning message like this is shown:

Warning message: In Sys.setlocale(category = "LC_ALL", locale = "en_US.UTF-8") : OS reports request to set locale to "en_US.UTF-8" cannot be honored
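This warning means the operating system does not recognise a locale by that name. Locale names are platform-specific, and Windows generally uses names such as `English_United States.1252` rather than the POSIX-style `en_US.UTF-8`. Below is a small sketch that tries several candidate names and reports which one, if any, was accepted. The helper name `try_set_locale` and the candidate list are illustrative assumptions, and the sketch defaults to `LC_CTYPE` (the category that governs character handling) rather than `LC_ALL`.

```r
# Hypothetical helper: try locale names in order and return the first one
# the operating system accepts, or NA if none is accepted.
try_set_locale <- function(candidates, category = "LC_CTYPE") {
  for (loc in candidates) {
    # Sys.setlocale() returns "" (with a warning) when the request
    # cannot be honored.
    result <- suppressWarnings(Sys.setlocale(category, loc))
    if (nzchar(result)) {
      return(result)
    }
  }
  NA_character_
}

# The "C" locale is listed last as a fallback that should always be available.
accepted <- try_set_locale(c("en_US.UTF-8", "English_United States.1252", "C"))
print(accepted)
```

Note that a successful call changes the locale of the running session, so this is best tried in a fresh R session.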

In addition, I found that RefManageR::ReadBib, which is used in your source code, is probably the root cause. Would it be possible to modify the relevant source lines in RefManageR to fix it?

crsh commented 5 years ago

> In addition, I found RefManageR::ReadBib used in your source code is probably the root cause. Would it be possible to modify the source lines of code in RefManageR to fix it?

Could you elaborate on what exactly you mean?

crsh commented 5 years ago

I just pushed an update that may help. Could you give it a try?

crsh commented 5 years ago

I just pushed a commit to fix what I think is a related issue. Could you install the latest development version and let me know if this fixes the issue for you?

no-response[bot] commented 5 years ago

This issue has been automatically closed because there has been no response to our request for more information from the original author. With only the information that is currently in the issue, we don't have enough information to take action. Please reach out if you have or find the answers we need so that we can investigate further.

sammo3182 commented 5 years ago

@crsh, the latest development version still produces a weird preview on my system:

When using the default English locale: image

And when setting Sys.setlocale(, "Chinese"): image

The .bib file is attached and saved in UTF-8. I guess RefManageR might still be the problem, since it might never have been fixed, as the issue indicated. I wonder how you managed to show the correct characters here, though.

CNKI-636862711945718750.zip
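To double-check that a file really contains valid UTF-8, the bytes can be inspected independently of the current locale. This is a minimal sketch. The helper name `is_utf8_file` is an assumption for illustration, and a passing check only shows that the bytes form valid UTF-8, not that every tool along the way will decode them as such.

```r
# Hypothetical helper: TRUE if the bytes of the file form valid UTF-8.
is_utf8_file <- function(path) {
  # Read raw bytes so the session locale plays no role.
  bytes <- readBin(path, what = "raw", n = file.info(path)$size)
  # validUTF8() tests the bytes of the string themselves.
  validUTF8(rawToChar(bytes))
}
```

Plain ASCII files also pass this check, since ASCII is a subset of UTF-8.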

Here's my sessioninfo:

R version 3.5.2 (2018-12-20)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17763)

Matrix products: default

locale:
[1] LC_COLLATE=Chinese (Simplified)_China.936 
[2] LC_CTYPE=Chinese (Simplified)_China.936   
[3] LC_MONETARY=Chinese (Simplified)_China.936
[4] LC_NUMERIC=C                              
[5] LC_TIME=Chinese (Simplified)_China.936    

attached base packages:
[1] stats     graphics  grDevices
[4] utils     datasets  methods  
[7] base     

other attached packages:
[1] shiny_1.2.0

crsh commented 5 years ago

Thanks, I'll take another look as soon as I can.

crsh commented 5 years ago

@sammo3182

Okay, this has been a long while, but I finally made some time to look into this. This is still giving me headaches, but I think I'm one step closer. Some of the problem seems to be with RefManageR::ReadBib().

With my default English locale I can read the file, but with the same encoding problems you see in the addin.

RefManageR::ReadBib("CNKI-636862711945718750.bib")
[1] ç. 谢昕. “基于公共行政理念的政府公共关系发展历程探析”. In:
_湖北社会科学_ (09 2005). 42-1112/C, pp. 14-16. ISSN: 1003-8477.

After I set the locale to simplified Chinese, it seems there is a problem parsing the .bib-file:

Sys.setlocale('LC_ALL','chs')

RefManageR::ReadBib("CNKI-636862711945718750.bib")
Ignoring entry 'XieXin2005'  (line1) because:
    A bibentry of bibtype ‘Article’ has to specify the field: author
Warning message:
In strsplit(x, rx) : input string 1 is invalid in this locale

Note that the author field is provided in the file and is parsed without problems (except for encoding) when the locale is English.

Reading the file with checks disabled confirms that some fields get mixed up during reading:

bib <- RefManageR::ReadBib("CNKI-636862711945718750.bib", check = FALSE)
str(bib)
Classes 'BibEntry', 'bibentry'  hidden list of 1
 $ :List of 12
  ..$ title           : chr "鍩轰簬鍏叡琛屾斂鐞嗗康鐨勬斂搴滃叕鍏卞叧绯诲彂灞曞巻绋嬫帰鏋\x90"
  ..$ journal         : chr "婀栧寳绀句細绉戝"
  ..$ year            : chr "2005"
  ..$ authoraddress   : chr "涓浗鍦拌川澶у鏀挎硶瀛﹂櫌,涓浗鍦拌川澶у鏀挎硶瀛﹂櫌 婀栧寳姝︽眽430074,婀栧寳姝︽眽430074"
  ..$ issue           : chr "09"
  ..$ pages           : chr "14-16"
  ..$ keywords        : chr "鏀垮簻鍏叡鍏崇郴;;鍏叡琛屾斂;;鍙戝睍鍘嗙▼;;鐞嗗康"
  ..$ abstract        : chr "鍦ㄥ叕鍏辫鏀跨殑鍘嗗彶闀挎渤涓\xad,鏀垮簻鍏叡鍏崇郴鏄竴涓柊鐢熺殑浜嬬墿,瀹冧即闅忕潃鍏叡琛屾斂鐨勫彂灞曡€屽彂灞\x95,缁忓巻
  ..$ isbn            : chr "1003-8477"
  ..$ issn            : chr "1003-8477"
  ..$ notes           : chr "42-1112/C"
  ..$ databaseprovider: chr "CNKI"
  ..- attr(*, "bibtype")= chr "Article"
  ..- attr(*, "key")= chr "XieXin2005"
  ..- attr(*, "dateobj")= POSIXct[1:1], format: "2005-01-01"

If I understand correctly, this is the information that is displayed by the citr addin once you set the locale to simplified Chinese (your second screenshot).
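The garbled strings in this output look consistent with UTF-8 bytes being decoded as GBK (code page 936). For instance, the journal field resembles what that misreading does to 湖北社会科学. This is an inference from the output, not something confirmed in this thread. Here is a short sketch that reproduces the effect on a two-character string, assuming the platform's `iconv()` knows the encoding name `"GBK"`:

```r
original <- "\u6e56\u5317"  # 湖北, stored as UTF-8

# Misread the UTF-8 bytes as GBK and convert the result to UTF-8.
garbled <- iconv(original, from = "GBK", to = "UTF-8")
print(garbled)

# Reverse the misreading: encode back to GBK to recover the original bytes,
# then declare those bytes to be UTF-8.
repaired <- iconv(garbled, from = "UTF-8", to = "GBK")
Encoding(repaired) <- "UTF-8"
print(identical(charToRaw(repaired), charToRaw(original)))
```

The round trip only works when no bytes were lost along the way. The stray `\x90` at the end of the title field above suggests that real-world cases can leave incomplete byte sequences behind, so this is a diagnostic aid rather than a reliable repair.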

Maybe @mwmclean has some idea of what's going on here?