simonsteele / pn

Programmer's Notepad
programmer's notepad #208

Closed Bielzebub1981 closed 2 years ago

Bielzebub1981 commented 2 years ago

I'm trying to make a document about Eurovision in which the information for each country or act reads like a 5 or 6 line story. The main problem is that Programmer's Notepad seems unable to recognise the text of languages such as the one used in Yugoslavia, referred to as Serbo-Croatian (https://en.wikipedia.org/wiki/Serbo-Croatian).

veganaize commented 2 years ago

Seems to work for me... image

Which font are you using (in Tools->Options->Fonts and Colours)? I'm using Courier New.

Your font will need to include support for those characters.

Bielzebub1981 commented 2 years ago

I copied the line below from Eurovision 1962, Participants and results.

However, as you can see, there are a lot of ? marks, yet the font is set to Courier New.

Yugoslavia Lola Novakovic "Ne pali svetla u sumrak" (?? ???? ?????? ? ??????) Serbo-Croatian 4 10

veganaize commented 2 years ago

Tools -> Options -> General -> Defaults... image

https://www.fontsquirrel.com/fonts/list/find_fonts?filter%5Blanguages%5D%5B0%5D=croatian&filter%5Blanguages%5D%5B1%5D=serbian

It's still working for me (with Courier New)... image

Bielzebub1981 commented 2 years ago

image

As you can see here, the standard Windows text application pasted the Не пали светла у сумрак text, so why can't Programmer's Notepad?

Current Font: Courier New


Edition: Windows 10 Enterprise
Version: 21H2
Installed on: 14/03/2022
OS build: 19044.1806
Experience: Windows Feature Experience Pack 120.2212.4180.0

veganaize commented 2 years ago

image

Bielzebub1981 commented 2 years ago

UTF-8 No Mark seems to have sorted the issue 😄
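A likely reason the encoding change helped can be sketched in a few lines of Python. This is only an illustration, assuming the "ANSI" option corresponds to the Windows-1252 code page on this machine (that mapping is an assumption, not something confirmed in this thread). Windows-1252 has no Cyrillic letters, so each one would be replaced by a question mark, while UTF-8 keeps them all.

```python
# Assumption: "ANSI" means Windows-1252 (cp1252) here; other systems may differ.
title = "Не пали светла у сумрак"

# cp1252 cannot represent Cyrillic, so every such letter becomes "?".
ansi_bytes = title.encode("cp1252", errors="replace")
ansi_view = ansi_bytes.decode("cp1252")
print(ansi_view)  # ?? ???? ?????? ? ??????

# UTF-8 can represent every character, so the round trip is lossless.
utf8_bytes = title.encode("utf-8")
utf8_view = utf8_bytes.decode("utf-8")
print(utf8_view == title)  # True
```

The pattern of question marks matches the one in the pasted Yugoslavia line, with spaces preserved and one ? per Cyrillic letter.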

Looking at the encoding options under File, I see: ANSI, UTF-8, UTF-16 Big Endian, UTF-16 Little Endian, UTF-8 No Mark.

Then, if you go to Tools > Options > General, click on Defaults and go down to Encoding, the options are: ANSI, UTF-8, UTF-8 No BOM, UTF-16 Big Endian, UTF-16 Little Endian.

How come one says No Mark, but the other says No BOM?

veganaize commented 2 years ago

BOM means "byte order mark". Its purpose is to signal the order in which to expect the bytes of each character. But it tends to cause more trouble than it's worth with UTF-8, so it's generally recommended to avoid it there. "No Mark" and "No BOM" presumably mean the same thing: UTF-8 saved without that mark.

https://www.unicode.org/faq/utf_bom.html
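To make the byte order mark more concrete, here is a small Python sketch. It uses Python's own codec names (`utf-8`, `utf-8-sig`, `utf-16-le`, `utf-16-be`) purely for illustration; how Programmer's Notepad implements its encoding options internally is not shown in this thread.

```python
import codecs

text = "Не"  # two Cyrillic letters, U+041D and U+0435

# The marks themselves, as defined in Python's codecs module.
bom_utf8 = codecs.BOM_UTF8.hex()          # efbbbf
bom_utf16_le = codecs.BOM_UTF16_LE.hex()  # fffe
bom_utf16_be = codecs.BOM_UTF16_BE.hex()  # feff

# In UTF-16 the byte order really changes the bytes on disk.
utf16_le = text.encode("utf-16-le").hex()  # 1d043504
utf16_be = text.encode("utf-16-be").hex()  # 041d0435

# In UTF-8 there is only one byte order, so the mark is just a signature.
no_bom = text.encode("utf-8")        # d0 9d d0 b5
with_bom = text.encode("utf-8-sig")  # ef bb bf d0 9d d0 b5

# A reader that does not expect the mark sees a stray U+FEFF character.
stray = with_bom.decode("utf-8")
clean = with_bom.decode("utf-8-sig")
print(repr(stray), repr(clean))
```

The last two lines show one kind of trouble the mark can cause with UTF-8: a tool that decodes the file as plain UTF-8 ends up with an invisible extra character at the start of the text.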

I'm glad your issue has been corrected!