mate-desktop / pluma

A powerful text editor for MATE
http://www.mate-desktop.org
GNU General Public License v2.0

C# highlighting isn't in effect until a line is edited #573

Open fcard opened 4 years ago

fcard commented 4 years ago

Expected behaviour

As soon as a .cs file is opened, syntax highlighting is applied to the entire file. This works correctly for every other file type I tried, including .c, .cpp, .h, .java, and .bash.

Actual behaviour

After the file is opened, it is initially displayed without any syntax highlighting, except possibly on the first line. Whenever you edit a line, that line gets syntax highlighting. Pressing Ctrl+A and then Tab restores highlighting for the whole file, since it edits every line.

Gif

Steps to reproduce the behaviour

Open a C# file (.cs).

MATE general version

From mate-about: 1.24.0

Package version

Tried with both 1.24.0 and master (commit bba42125fb0d1dd10246e20e0b1156b5c40620f7), same result.

1.24.0 came pre-installed with distro.

master was installed like this:

$ cd ~/Source/pluma
$ ./autogen.sh
$ ./configure --prefix=$HOME/Prefix
$ make
$ make install

A shell script sets PATH, LIBRARY_PATH, and XDG_DATA_DIRS to reference the appropriate $HOME/Prefix subdirectories, with priority over the system directories.
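The wrapper script itself isn't shown in this report. A minimal sketch of what such a setup could look like, assuming the `$HOME/Prefix` layout from the `./configure` call above. The function name `use_prefix` and the conventional `bin`, `lib`, and `share` subdirectories are assumptions for illustration, not the reporter's actual script:

```bash
#!/bin/bash
# Hypothetical helper: put a private install prefix ahead of the system directories.
# Usage: use_prefix [PREFIX]   (PREFIX defaults to $HOME/Prefix)
use_prefix() {
    local prefix="${1:-$HOME/Prefix}"

    # Executables: find the freshly built pluma before the distro one.
    export PATH="$prefix/bin:$PATH"

    # Libraries: only add a separator when the variable is already set.
    export LIBRARY_PATH="$prefix/lib${LIBRARY_PATH:+:$LIBRARY_PATH}"

    # Data files: fall back to the XDG default search path when unset.
    export XDG_DATA_DIRS="$prefix/share:${XDG_DATA_DIRS:-/usr/local/share:/usr/share}"
}
```

After sourcing a file containing this function, something like `use_prefix && pluma PlayerShip.cs` would start the build installed under the prefix rather than the packaged one.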

Linux Distribution

Ubuntu 20.04.1 LTS

sleeveroller commented 3 years ago

I can reproduce this issue with:

Ubuntu 20.04, Pluma 1.24.0

It only seems to happen if the file starts with the Unicode BOM.

I've attached two test files, one with the BOM and one without. Textually, the files are identical, but the file with the BOM only has its first line highlighted. The file without the BOM has both lines highlighted immediately upon loading.

test.cs.zip

fcard commented 3 years ago

I didn't realize until you said it, but it seems the actual issue for me was that the files were created by Unity3D, rather than that they were C# files. A C# file created by other means (such as Pluma itself) works as intended.

I copied a .cs file from a Unity project and renamed it with a variety of other extensions, and the effect carried over: Pluma detected the language for each extension but highlighted only the first line until the other lines were edited. So it's not actually restricted to C#. The only exception I found was .c, which made Pluma "highlight" the file as plain text.

lukefromdc commented 3 years ago

The question then becomes: what is Unity3D adding to the file that is not present when you use Pluma or another text editor? That Unicode BOM? Does this also happen if you copy that in from Pluma?

fcard commented 3 years ago

Yes, it does seem that Unity adds the UTF-8 BOM (0xef, 0xbb, 0xbf):

$ xxd -c 1 PlayerShip.cs | head
00000000: ef  .
00000001: bb  .
00000002: bf  .
00000003: 75  u
00000004: 73  s
00000005: 69  i
00000006: 6e  n
00000007: 67  g
00000008: 20   
00000009: 53  S

I copied it into the start of a random file and it did indeed cause the same issues.
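For anyone who wants to check their own files, here is a small sketch of the same test as shell functions, plus a way to write a BOM-free copy as a workaround. The names `has_utf8_bom` and `strip_utf8_bom` are made up for this example, and it only assumes standard `head`, `od`, `tr`, and `tail`:

```bash
#!/bin/bash
# has_utf8_bom FILE: succeed if FILE starts with the bytes EF BB BF.
has_utf8_bom() {
    # Dump the first three bytes as hex and squeeze out od's spacing.
    [ "$(head -c 3 < "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# strip_utf8_bom SRC DEST: copy SRC to DEST, dropping a leading BOM if present.
strip_utf8_bom() {
    if has_utf8_bom "$1"; then
        # tail -c +4 starts output at the fourth byte, skipping the 3-byte BOM.
        tail -c +4 < "$1" > "$2"
    else
        cat < "$1" > "$2"
    fi
}
```

For example, `has_utf8_bom PlayerShip.cs && strip_utf8_bom PlayerShip.cs PlayerShip.nobom.cs` would leave the original untouched and write the copy next to it. Whether Unity re-adds the BOM when it next saves the file is not something this sketch addresses.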

lukefromdc commented 3 years ago

I will have to leave fixing this to the other devs who understand this part of the code, but at least we have confirmed what causes the problem.

rbuj commented 3 years ago

Pluma does not highlight files with a BOM when the language is in the bom_langs list (see #357): https://github.com/mate-desktop/pluma/blob/bba42125fb0d1dd10246e20e0b1156b5c40620f7/pluma/pluma-document.c#L727

Test files: test_bom.cs and test.cs

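#!/bin/bash
# Create two textually identical C# files:
# test_bom.cs (with a UTF-8 BOM) and test.cs (without).

# Append the sample program to the file named by $1.
body() {
    LANG=en_US.UTF-8 cat << 'EOF' >> "$1"
using System;

namespace HelloWorld
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Hello World!");
        }
    }
}
EOF
}

# test_bom.cs: write the three BOM bytes first, then the program.
rm -f test_bom.cs
printf '\xEF\xBB\xBF' > test_bom.cs
body test_bom.cs

# test.cs: the program only.
rm -f test.cs
body test.cs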

Adding c-sharp to the bom_langs list:

diff --git a/pluma/pluma-document.c b/pluma/pluma-document.c
index a7691f4..a21aff1 100644
--- a/pluma/pluma-document.c
+++ b/pluma/pluma-document.c
@@ -701,8 +701,8 @@ set_language (PlumaDocument     *doc,
    GtkSourceLanguage *old_lang;
    const gchar       *new_lang_id;
    const gchar       *bom_langs[] = {
-       "asp", "dtl", "docbook", "html", "mxml", "mallard", "markdown",
-       "mediawiki", "php", "tera", "xml", "xslt", NULL
+       "asp", "c-sharp", "dtl", "docbook", "html", "mxml", "mallard",
+       "markdown", "mediawiki", "php", "tera", "xml", "xslt", NULL
    };
    gboolean is_bom_lang = FALSE;

If we omit this restriction, it looks like a gtk_source_buffer_set_highlight_syntax issue with c-sharp and BOM.

Removing BOM check

diff --git a/pluma/pluma-document.c b/pluma/pluma-document.c
index a7691f4..cf7457c 100644
--- a/pluma/pluma-document.c
+++ b/pluma/pluma-document.c
@@ -650,61 +650,12 @@ pluma_document_class_init (PlumaDocumentClass *klass)
                  GTK_TYPE_TEXT_ITER | G_SIGNAL_TYPE_STATIC_SCOPE);
 }

-static gboolean
-file_with_bom (GFile *file)
-{
-   FILE    *testfile;
-   gchar    c;
-   int      i;
-   gchar    bom[3];
-   gchar   *file_path;
-
-   bom[0] = bom[1] = bom[2] = 0;
-
-   file_path = g_file_get_path (file);
-
-   testfile = fopen (file_path, "r");
-
-   g_free (file_path);
-
-   if (testfile == NULL)
-   {
-       perror ("fopen");
-       return FALSE;
-   }
-
-   for (i = 0; i < 3; i++)
-   {
-       c = fgetc (testfile);
-
-       if (c == EOF)
-           break;
-       else
-           bom[i] = c;
-   }
-
-   fclose (testfile);
-
-   if ((bom[0] == '\357') &&
-       (bom[1] == '\273') &&
-       (bom[2] == '\277'))
-       return TRUE;
-   else
-       return FALSE;
-}
-
 static void
 set_language (PlumaDocument     *doc,
               GtkSourceLanguage *lang,
               gboolean           set_by_user)
 {
    GtkSourceLanguage *old_lang;
-   const gchar       *new_lang_id;
-   const gchar       *bom_langs[] = {
-       "asp", "dtl", "docbook", "html", "mxml", "mallard", "markdown",
-       "mediawiki", "php", "tera", "xml", "xslt", NULL
-   };
-   gboolean is_bom_lang = FALSE;

    pluma_debug (DEBUG_DOCUMENT);

@@ -713,27 +664,7 @@ set_language (PlumaDocument     *doc,
    if (old_lang == lang)
        return;

-   new_lang_id = gtk_source_language_get_id (lang);
-   if (new_lang_id)
-       is_bom_lang = g_strv_contains (bom_langs, new_lang_id);
-
-   if (is_bom_lang)
-   {
-       GFile *file;
-
-       file = pluma_document_get_location (doc);
-       if (file)
-       {
-           if (!file_with_bom (file))
-               gtk_source_buffer_set_language (GTK_SOURCE_BUFFER (doc), lang);
-
-           g_object_unref (file);
-       }
-       else
-           gtk_source_buffer_set_language (GTK_SOURCE_BUFFER (doc), lang);
-   }
-   else
-       gtk_source_buffer_set_language (GTK_SOURCE_BUFFER (doc), lang);
+   gtk_source_buffer_set_language (GTK_SOURCE_BUFFER (doc), lang);

    if (lang != NULL)
    {

sleeveroller commented 3 years ago

Yes, I think this is a problem with the upstream GNOME GtkSourceView project.

I've just tried the same files with gedit and I get the same problem: only the first line is highlighted when there is a BOM.

fcard commented 3 years ago

Possibly related: https://gitlab.gnome.org/GNOME/gtksourceview/-/issues/30. If that's the same crash as the one fixed by #357, it seems to have been fixed upstream. I tried a Pluma build with the "Removing BOM check" patch above and it seems to open all test files without issue. It still doesn't help with the actual issue at hand, of course.

rbuj commented 3 years ago

@fcard Thank you very much for the find, PR #574

mbkma commented 3 years ago

@rbuj Can this issue be closed?

rbuj commented 3 years ago

No, the issue with highlighting .cs files that have a BOM has not been resolved yet. Of the files packed in test.zip, test.cs is highlighted correctly, but test_bom.cs only has its first line highlighted.

test.zip

rbuj commented 3 years ago

The file attached in https://gitlab.gnome.org/GNOME/gtksourceview/-/issues/30 is highlighted correctly, so I think it's a GtkSourceView problem.

$ wget -q https://gitlab.gnome.org/GNOME/gtksourceview/uploads/7a547f881776e85c92e3732d75f1087f/wp-config.php
$ file wp-config.php
wp-config.php: PHP script, UTF-8 Unicode (with BOM) text
$ pluma wp-config.php

fcard commented 2 years ago

Hi! This seems to have been fixed upstream starting with GtkSourceView 4.8.1. I just installed 4.8.3 and it highlights the problematic file flawlessly.

Relevant issue on the gtksourceview repo: https://gitlab.gnome.org/GNOME/gtksourceview/-/issues/168

I hadn't noticed until now because my distro is still on 4.6, but I was looking back on my old issues and decided to check 😊
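To see which side of that version boundary a system is on, a rough sketch follows. It assumes a `sort` that supports `-V` (version sort, as in GNU coreutils), and the function name `version_at_least` is made up for this example:

```bash
#!/bin/bash
# version_at_least HAVE WANT: succeed if version HAVE is the same as or newer than WANT.
version_at_least() {
    # With version sort, the smaller of the two comes first;
    # HAVE >= WANT exactly when that first line is WANT.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

Assuming the GtkSourceView 4 development files are installed and register the pkg-config module `gtksourceview-4`, the installed version could be checked with `version_at_least "$(pkg-config --modversion gtksourceview-4)" 4.8.1 && echo "should include the fix"`. The 4.8.1 threshold is taken from the comment above, not verified independently.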