Beakerboy / VBA-SQL-Library

Object-based Database Interaction for VBA. Create SQL Statements with VBA objects instead of string concatination.
69 stars 16 forks source link

About line endings #56

Open DecimalTurn opened 11 months ago

DecimalTurn commented 11 months ago

I notice that in this PR https://github.com/Beakerboy/VBA-SQL-Library/pull/43, you added a .gitattributes file that makes it so that line endings are converted to LF in the Git index.

I've personally read in a few places that it was a good practice as well, but I've recently come to the conclusion that normalizing line endings to LF for VBA files doesn't actually bring any value because GitHub is already smart enough to deal with CRLF line endings.

The problem with eol=clrf is that the conversion to CRLF only occurs when someone clones the repo or downloads it as a .zip file. GitHub won't make the line ending conversion when someone downloads a single file using the "Download raw file" option in the GitHub interface.

Then, people that don't know about this problem will inevitably try to down a single file from the project and get an error when trying to import it in the VBE. For example: https://github.com/VBA-tools/VBA-Dictionary/issues/38

For that reason and because there is no known advantages to convert to LF, I'd suggest to change the .gitattributes file to avoid applying line conversion.

Potential approaches

  1. Mark the vba files with -text to avoid line endings conversion.

This is the simplest way to avoid most problems. The only downside of this approach is that it doesn't stop anyone from uploading a file with LF.

  1. Approach 1 + Use a filter to force the conversion to CRLF.

This approach is particularily useful if you make edits via VScode and don't what to introduce LF by mistake. However, it requires contributors to run a command to inlcude the .gitconfig to their local config.

For example:

.gitattributes:

# By default, auto detect text files and perform LF normalization
* text=auto eol=lf

# Important: To make sure the crlf filter is active, run the following command at the root of the git repo: git config include.path ../.gitconfig
# (The reason why we need the ".." is to move one folder up because the config file is located in the .git subfolder.)

# VBA extensions - Enforce CRLF using a filter
*.[bB][aA][sS]      filter=crlf -text working-tree-encoding=CP1252 
*.[cC][lL][sS]      filter=crlf -text working-tree-encoding=CP1252 
*.[fF][rR][mM]      filter=crlf -text working-tree-encoding=CP1252
*.[vV][bB][aA]      filter=crlf -text working-tree-encoding=CP1252

# VBA extensions - Mark as binary
*.[fF][rR][xX]      binary

.gitconfig:

[filter "crlf"]
        clean = unix2dos
  1. Use a GitHub Action to enforce CRLF

This would be the only full proof method to maintain consitent line endings.

I think that would be a good thing to add this option to https://github.com/Vba-actions/lint-vba. I might actually have some python code I could contribute if you are interested to add this feature.

Beakerboy commented 11 months ago

Thank you for the well written thought out message. My long term goal is to allow users to develop VBA projects with git, and use actions to compile them into usable excel addin files. I also personally tend to use GitHub as my IDE, so saving files as text is not ideal. So this process requires checked out code with the correct endings that can be compressed to to the OLE file. Also, I tend to copy and paste code from online sources into the excel VBA editor versus using the actual “import” method, so the line endings have never been a problem. Maybe the best approach is to add a note to my README files informing people about the line ending status.

DecimalTurn commented 11 months ago

That's a very commendable goal and I'll gladly help if I think I can make a positive difference towards it! However, I don't see how what I'm suggesting goes against your vision, but maybe I'm misunderstanding something. Perhaps, I should have asked: what was the goal when adding line conversion in the .gitattributes file?

Beakerboy commented 11 months ago

I don't see how what I'm suggesting goes against your vision

more then likely I’ve misunderstood some of what you said then, since I’m not much of an expert in these things. I’ll have to Google how the .gitattributes and .gitconfig files work. Is unix2dos available on non-Linux dev environments?

in the past when I’ve had files with CRLF, editing them through the web interface was pretty buggy. They’ve updated the interface so that might not be a problem anymore. It would add a ton of tab indentations automatically regardless of what setting was selected. For example, if you opens a function, and then closed it with “End Function” it would continue to indent the next function by one level. It would do the same for loops. This would only happen with CRLF.

Beakerboy commented 11 months ago

I’m happy to accept any pull requests for any project. One of my plans is to expand the lint-VBA project to more thorough analysis of the ANTLR parse tree. Inspecting the line endings would probably be the easiest first step on this path.

DecimalTurn commented 11 months ago

Regarding how things work with the .gitattributes and .gitconfig file, I should clarify 2 things:

First, -text doesn't mean that git will treat the files as binary, it just prevents the enabling of line conversion per the docs:

This attribute marks the path as a text file, which enables end-of-line conversion

Second, the .gitconfig file doesn't have a special status, it could've been named anything, the important part is to make sure that the file config located inside the local .git folder contains a line to include it.

DecimalTurn commented 11 months ago

Is unix2dos available on non-Linux dev environments?

Yes, it is part of the many binary tools included in Git for Windows, mine is located in C:\Program Files\Git\usr\bin.

However, it is not available when editing the file directly on GitHub which is why the 3rd approach might be the best way to go about all this.

in the past when I’ve had files with CRLF, editing them through the web interface was pretty buggy

I didn't know about this. Can you confirm that it's fixed? If that's still the case, that does make the decision regarding line endings more of a trade-off than a clear-cut answer. Personally, I've never seen any automatic indentations added in the rare cases I would edit a VBA file directly on GitHub. Are we talking about the new VScode web editor?

Beakerboy commented 11 months ago

new VScode web editor?

no, definitely on the previous version. There is a more “IDE-like” web editor, but this was the standard, glorified text editor. I don’t know what VSCode is..,I guess I’ll Google it.

That's a very commendable goal and I'll gladly help if I think I can make a positive difference towards it!

My big hurdle right now is getting a vbaproject.bin that is recognized by Excel. The python oletools program gives it the thumbs up, but Excel says it’s corrupt. My VBA-Project-Compiler project can create the OLE container, but I really need to make it also deconstruct and report back on the characteristics of the file to more easily check if it is constructed correctly. If you know python, or want to learn python, I would love help there.

DecimalTurn commented 11 months ago

I don’t know what VSCode is..,I guess I’ll Google it.

I'm referring to the new GitHub web editor that you get when you go to this page for instance: https://github.dev/Beakerboy/VBA-SQL-Library

It's not indicated anywhere that it's VS Code (aka. Visual Studio Code), but it is basically a web version of the desktop version of VS Code.

...getting a vbaproject.bin that is recognized by Excel

I'm no expert, so not sure how helpful I can be. If you can make a small reproducible example, it could be a good thing to post about on Stack Overflow.

DecimalTurn commented 11 months ago

Anyhow, regarding line endings, if you prefer to keep LF endings on GitHub to prevent potential bugs with the interface, it's totally justified and I won't impose my method.

Maybe the best approach is to add a note to my README files informing people about the line ending status.

This is prefectly OK.


Note however that currently the line endings in the Git index are still CRLF for bas/frm/cls files. This means that as soon as someone does a checkout of the master branch, Git will try to perform the line conversion specified by the .gitattributes and mark the files as containing changes.

For instance, if I clone the repo and do a git status right after, I see that all these files have "pending changes" :

image

I'd suggest to commit those line ending changes to avoid confusion for people looking at the project for the first time. If you don't see these changes, you can always run the command :

git add --renormalize .

Reference: How to normalize working tree line endings in Git?

Beakerboy commented 11 months ago

Note however that currently the line endings in the Git index are still CRLF for bas/frm/cls files

thanks. I thought I checked and changed them all in pull request #44

Beakerboy commented 11 months ago

Feel free to make changes the way you would if it were your project and submit a pull request. Then I could check out that branch and mess around with it.

I doubt stack overflow would help with my project. It’s very much in the weeds. I’ve posed a question in the GitHub discussion area of a similar project, but nothing but crickets so far.

I didn’t know a bit of python, nor anything about OLE containers before starting any of this, so I’m more than happy to have anyone interested and motivated helping in any way they want. I have discussions turned on for all the projects, so feel free to ask questions where appropriate. The utilities are:

my goal is 100% code coverage and testing as many edge cases as I can come up with, in as isolated of a manner as possible. If you DO know python and have a knack for mocks, dependency injection, and interfaces, do whatever you think is good.

For example, this file has only 80% code coverage. By creating tests you would learn how the tool works and might be able to either learn python, spot a bug, or create a test that fails which would flag a bug.

DecimalTurn commented 11 months ago

I’ve posed a question in the GitHub discussion area of a similar project, but nothing but crickets so far.

I was going to suggest having a look at https://github.com/decalage2/olefile, but I see that's where you posted already 😅.

Beakerboy commented 11 months ago

Yeah, I was hoping someone would like to stress test it. The oletools package is to my VbaProject-Compiler project what olefile is to MS-CFB. However, MS-CFB can both assemble and disassemble OLE containers while olefile can only disassemble them.

I’m working on adding VBA OLE disassemble to VBAProject-Compiler, at which point I’m changing the name to MS-OVBA. Adding disassembly will also help with testing.

Beakerboy commented 11 months ago

My Linting project is able to identify incorrect line endings in files. With this tool on hand, what would you propose as the best strategy? I’m guessing you’ll say save the files with CRLF in the repository and use the Linting tool to ensure they stay that way.

DecimalTurn commented 11 months ago

Nice. Will that be incorporated into your github action that is on the marketplace?

I’m guessing you’ll say save the files with CRLF in the repository and use the Linting tool to ensure they stay that way.

Yes, that's an accurate guess.

Beakerboy commented 11 months ago

Will that be incorporated into your github action … ?

Yes. The dev branch works, but is not release ready. The main-directory branch is where I’m currently doing my work. Look over the list of checks and let me know if you want anything added.