BdR76 / CSVLint

CSV Lint plug-in for Notepad++ for syntax highlighting, csv validation, automatic column and datatype detecting, fixed width datasets, change datetime format, decimal separator, sort data, count unique values, convert to xml, json, sql etc. A plugin for data cleaning and working with messy data files.
GNU General Public License v3.0
151 stars 8 forks

Cannot specify comment rows throughout the data #48

Closed anuragsodhi closed 1 year ago

anuragsodhi commented 1 year ago

It would be great to have a 'comment row starts with' option, so that comment rows can be colored differently.

BdR76 commented 1 year ago

Thanks for reporting the issue, but this is the same as reported in #46

anuragsodhi commented 1 year ago

Thanks, but I don't think it is the same, especially if you have a csv file like the following:

# This is a comment line. Dont read me
a,1
b,2
c,3
# This is also a comment line. skip me as well.
d,4

In the file above, it would be good to have an option to skip any row that starts with a given character, such as # in this case.

BdR76 commented 1 year ago

The latest version of the CSV Lint plug-in only supports comment lines at the beginning of the file. CSV files aren't formally standardized and comment lines in csv files are quite uncommon afaik. The only relevant Stackoverflow question I could find is from 10 years ago. So there isn't really a standard for csv comments and I added support for it just based on the description in this issue.

Is this something you encounter in csv files you are working with? I mean, this isn't just a technical test, so to speak? If that is the case, can you attach a test data file here, or, if you don't want to share the data publicly, send me an e-mail with an example file with a representative number of columns and comments?

BdR76 commented 1 year ago

Btw, if you have real-life examples of a csv file that uses comment lines in the way you describe, then I think it might be a good idea to upvote this Microsoft suggestion or leave a comment describing your use case.

anuragsodhi commented 1 year ago

Yeah, this is a real-life use case that I encounter at work. Even the Python library pandas has similar functionality: https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html

As for data, I gave you the sample data earlier; my data is very similar to that. Even the other plugin for Notepad++, CsvQuery, has this option.

BdR76 commented 1 year ago

Ok I see that makes sense, using an explicit comment character is a better way to skip comment lines than just a SkipLines option. But also, I just remembered that the current implementation of SkipLines is in part also based on this stackoverflow question which provides an example file cocomo81.arff. The cocomo81.arff file doesn't have a clear comment character and should just skip the first X lines as comments.

So, ideally the plug-in should support both cases: 1) skip any line that starts with the # character (or another character such as ~, changeable in settings), and 2) skip the first X lines.
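To make the two cases concrete, here is a minimal Python sketch using only the standard library. It is an illustration of the idea, not how CSV Lint implements it; the function name `read_rows` and the parameters `comment_char` and `skip_lines` are hypothetical names chosen for this example.

```python
import csv

def read_rows(text, comment_char="#", skip_lines=0):
    """Parse CSV text, ignoring comment lines (hypothetical helper)."""
    # Case 2: unconditionally drop the first `skip_lines` lines.
    lines = text.splitlines()[skip_lines:]
    # Case 1: drop any line that starts with the comment character,
    # wherever it appears in the file.
    if comment_char:
        lines = [ln for ln in lines if not ln.startswith(comment_char)]
    # csv.reader yields an empty list for blank lines; filter those out.
    return [row for row in csv.reader(lines) if row]

sample = """# This is a comment line. Dont read me
a,1
b,2
c,3
# This is also a comment line. skip me as well.
d,4
"""

print(read_rows(sample))
# [['a', '1'], ['b', '2'], ['c', '3'], ['d', '4']]
```

Note that filtering line by line before parsing is a simplification: a quoted field that contains a line break starting with # would be cut apart, so a real implementation would need to handle that.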

anuragsodhi commented 1 year ago

Yes that would be great, thanks a lot!

BdR76 commented 1 year ago

Btw I just noticed that the Pandas library has both a skiprows and a skipfooter parameter, to skip lines at the start and/or at the end. The plugin should also include this skipfooter in some way.
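For illustration, skipping lines at both ends could look roughly like the sketch below. It uses only the Python standard library rather than pandas; the function `read_body`, its parameters, and the sample text are made up for this example and only mirror the idea behind skiprows and skipfooter.

```python
import csv

def read_body(text, skip_rows=0, skip_footer=0):
    """Parse CSV text, dropping lines at the start and end (hypothetical helper)."""
    lines = text.splitlines()
    # Never let the end index fall before the start index, so asking to
    # skip more lines than exist simply yields no rows.
    end = max(skip_rows, len(lines) - skip_footer)
    return list(csv.reader(lines[skip_rows:end]))

# Invented sample: one free-text line before the data and one after it.
sample = """Exported by some tool
name,value
a,1
b,2
Total rows: 2
"""

print(read_body(sample, skip_rows=1, skip_footer=1))
# [['name', 'value'], ['a', '1'], ['b', '2']]
```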

BdR76 commented 1 year ago

@anuragsodhi I've added support for comment characters as you described in this issue.

Can you install the development DLL from here and see if this works as intended for your case?

anuragsodhi commented 1 year ago

Hi, I am not able to load the plugin; it says it can't load the 32-bit version. I have tried both the x64 and the normal version and get the same error.

BdR76 commented 1 year ago

You mean at startup of Notepad++ you immediately get this error message?

[ C:\Program Files (x86)\Notepad++\plugins\CSVLint\CSVLint.dll ]

Cannot load 32-bit plugin.

CSVLint.dll is not compatible with the current version of Notepad++

This is the error you typically get when you've copied a 32-bit plugin into a 64-bit Notepad++, or a 64-bit plugin into a 32-bit Notepad++. Can you check under ? > About Notepad++ which Notepad++ version you have, and whether it is the 32-bit or the 64-bit version?

Are you sure you've copied the correct version of the dll for your 32bit/64bit Notepad++ version, so either the 32-bit dll or the 64-bit dll?
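If in doubt, the bitness of a DLL can be read from its PE header: the 4-byte value at offset 0x3C points to the "PE\0\0" signature, and the 2 bytes after that signature hold the machine type. Below is a small Python sketch of that check; the helper name `dll_bitness` is made up for this example, and the path in the usage comment is just the one from the error message above.

```python
import struct

# Machine type values from the PE/COFF file header.
MACHINE_TYPES = {
    0x014C: "32-bit (x86)",
    0x8664: "64-bit (x64)",
    0xAA64: "64-bit (ARM64)",
}

def dll_bitness(data: bytes) -> str:
    """Return the target architecture of a PE file given its raw bytes."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ header")
    # Offset 0x3C holds the little-endian file offset of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # The 16-bit machine field follows the 4-byte signature directly.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINE_TYPES.get(machine, f"unknown machine type 0x{machine:04X}")

# Usage (path taken from the error message, adjust as needed):
# from pathlib import Path
# dll = Path(r"C:\Program Files (x86)\Notepad++\plugins\CSVLint\CSVLint.dll")
# print(dll_bitness(dll.read_bytes()))
```

Running the same check on notepad++.exe and on the plugin DLL should give the same answer; if it doesn't, the wrong DLL was copied.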

anuragsodhi commented 1 year ago

Yes, I am sure. I am on 64-bit Notepad++ and downloaded the x64 version.

BdR76 commented 1 year ago

Ok, but it's hard for me to say what's causing the error without additional info; it could be a lot of things, like the Windows version, a codepage setting, a combination with another plugin etc.

When you remove the CSV Lint plugin, then open Notepad++ and go to ? > Debug Info..., can you select Copy debug info into clipboard and post what you see there?

anuragsodhi commented 1 year ago

I have added back the 4.6.3 version and am using that.

Notepad++ v8.4.6   (64-bit)
Build time : Sep 25 2022 - 19:51:39
Path : C:\Dev\Tools\Notepad++\notepad++.exe
Admin mode : OFF
Local Conf mode : ON
Cloud Config : OFF
OS Name : Windows 10 Enterprise (64-bit)
OS Version : 21H2
OS Build : 19044.2846
Current ANSI codepage : 1252
Plugins : 
    ComparePlus (1.1)
    CSVLint (0.4.6.3)
    CsvQuery (1.2.9)
    FWDataViz (2.5)
    mimeTools (2.8)
    NppConverter (4.4)
    NppExport (0.4)
    NppFTP (0.29.10)
    NppXmlTreeviewPlugin (2)
    XMLTools (3.1.1.13)

BdR76 commented 1 year ago

I can't quite explain why you get a crash with the latest plugin. You have an older version of Notepad++ (8.4.6), but that shouldn't really be an issue.

I have also installed Notepad++ v8.4.6 and tried several older versions of the CSV Lint plugin (see also here) and I couldn't find any issues.

If you open the folder %userprofile%\AppData\Roaming\Notepad++\plugins\config\ and you remove the files CSV Lint.ini and CSVLint.xml and then install the 0.4.6.4 plugin dll, and try Notepad++ again, does that help? Or if you open the folder %userprofile%\AppData\Roaming\Notepad++\ and you first rename the file session.xml to session.xml_old and then copy the csvlint dll and start Notepad++ again, does it still crash?

If that doesn't work, have you tried installing the latest version of Notepad++?

BdR76 commented 1 year ago

@anuragsodhi I just saw this comment, apparently Windows will sometimes block downloaded files. When you right click the CSV Lint DLL file, do you see an unblock checkmark?

anuragsodhi commented 1 year ago

Hi, I don't see an unblock checkmark (I don't think it is blocked). I never had any issue with other CSV Lint versions; it is just this 0.4.6.4 version. I'll try the steps of removing the ini and xml files and report back.

BdR76 commented 1 year ago

In the meantime there have been some updates, including a bugfix that had to do with the transparent cursor line sometimes crashing the plugin, also see issue #69

You could try downloading the latest development DLL v0.4.6.5ẞ4 and see if that fixes your crashing issue as well.

Btw to be clear, the original issue "Cannot specify comment rows" was fixed, support for a comment character was added, see screenshot below which uses the # character for comment lines.

(screenshot: csvlint_skiplines)

BdR76 commented 1 year ago

This comment row issue is fixed in the currently available version v0.4.6.4 and in the meantime the plugin was updated to v0.4.6.5, see the releases page, which also includes some bugfixes.

If you are still unable to load the plugin with the latest v0.4.6.5 dll please post a separate issue.

anuragsodhi commented 1 year ago

Happy to report that I was able to load the 0.4.6.5 version and the comment rows issue is fixed now. Thanks for all the work!

bucweat commented 6 months ago

Actually never mind. I figured out that I was 0.0.1 behind the current version but Plugins Admin was not offering me the newest version. I had to manually update Notepad++ to the latest 8.6.1 and then Plugins Admin had the latest version. Sorry for the noise...

Hi, I was looking for this feature and was happy to find it in CSVLint :-)

One difference I see locally compared to the example above is that the comments are colored with the first column's color and not grey. And if I have a line like

# audio, monitors

then monitors is colored with the color for column two. So it appears that CSVLint is parsing the commented lines.

Would it be possible to add ability to specify a comment color in the Settings > Style Configurator?

Notepad++ v8.6   (64-bit)
Build time : Nov 23 2023 - 16:58:44
Path : C:\Program Files\Notepad++\notepad++.exe
Command Line : "configSmall.csv"
Admin mode : OFF
Local Conf mode : OFF
Cloud Config : OFF
OS Name : Windows 11 Pro (64-bit)
OS Version : 21H2
OS Build : 22000.2538
Current ANSI codepage : 1252
Plugins : 
    CSVLint (0.4.6.5)
    JsonTools (5.6)
    MarkdownViewerPlusPlus (0.8.2)
    mimeTools (2.9)
    NppConverter (4.5)
    NppExport (0.4)

BdR76 commented 6 months ago

No problem about "the noise". But nothing changed in the comment functionality between plugin versions 4.6.5 and 4.6.6, so I think it might have been that the metadata for your specific csv file wasn't refreshed or something?

In the CSV Lint window there should be a line ;CommentChar=# in the metadata of that file, which tells the plugin to treat the comment lines as comment lines and highlight them all white/grey.

bucweat commented 6 months ago

Ah ok...I might have mucked with that particular line after the initial install before upgrading...I didn't really know what I was doing and so was poking at things...I would chalk it up to stupid user tricks then :-)