jtextor / dagitty

Graphical analysis of structural causal models / graphical causal models.
GNU General Public License v2.0

Using ⟂ to indicate independence #72

Open behrman opened 1 year ago

behrman commented 1 year ago

@jtextor For consistency with the dagitty web interface and the most commonly used mathematical notation, and for better readability, I recommend that the dagitty R package use the ⟂ symbol instead of the current _||_ to indicate independence. R supports Unicode, as all major operating systems have for many years.

Unicode actually has at least three related symbols:

I'd recommend using the perpendicular symbol. It's the symbol used by \perp in LaTeX and by the Python package with the closest functionality to dagitty.

I believe the required change would be:

https://github.com/jtextor/dagitty/blob/ca4ec745ccfeaf8d283543c978e3691178748279/r/R/dagitty.r#L2548

" _||_ " -> " \U27C2 "

jtextor commented 1 year ago

I'd rather not make this change. Use of non-ASCII characters in R packages is still discouraged, and I have gotten complaints from CRAN maintainers not too long ago for attempting to use them. See https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Encoding-issues

behrman commented 1 year ago

@jtextor I completely agree that you want to have the dagitty R package work for those with older versions of R. The dagitty package DESCRIPTION file currently specifies R (>= 3.0.0). According to the CRAN manual section you linked to, the Unicode escape \uxxxx is "a portable way to have arbitrary text in character strings (only) in your R code" for R (>= 2.10).

It may be useful to check with the CRAN maintainers to see if having the mathematical symbol \u27c2 in a character string in your R code would present a portability problem for those running R (>= 3.0.0). Rather than never adopting an advance, such as Unicode, the manual seems to suggest the approach of detecting older systems and making character substitutions when necessary.
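As a rough sketch of that detect-and-substitute approach: the helpers below are hypothetical and not part of dagitty. They assume that base R's l10n_info() is an acceptable way to check whether the session is running in a UTF-8 locale, and they write the symbol as a \u escape so the source file itself stays ASCII.

```r
# Hypothetical helper; not part of the dagitty package.
# Returns the perpendicular symbol when the session reports a UTF-8 locale,
# and the ASCII fallback "_||_" otherwise.
independence_symbol <- function(utf8 = isTRUE(l10n_info()[["UTF-8"]])) {
  if (utf8) "\u27c2" else "_||_"
}

# Hypothetical formatter for "x is independent of y given z".
# z is an optional character vector of conditioning variables.
format_independence <- function(x, y, z = character(0),
                                utf8 = isTRUE(l10n_info()[["UTF-8"]])) {
  out <- paste(x, independence_symbol(utf8), y)
  if (length(z) > 0) {
    out <- paste(out, "|", paste(z, collapse = ", "))
  }
  out
}

# Forcing the ASCII fallback, as an older or non-UTF-8 system would:
format_independence("X", "Y", c("Z1", "Z2"), utf8 = FALSE)
# "X _||_ Y | Z1, Z2"
```

Whether a locale check like this would satisfy the CRAN maintainers is an open question; the sketch only illustrates that the substitution itself can be small.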