continuedev / continue

⏩ Continue is the leading open-source AI code assistant. You can connect any models and any context to build custom autocomplete and chat experiences inside VS Code and JetBrains
https://docs.continue.dev/
Apache License 2.0

Continue doesn't respect `.gitignore` #1196

Open romiras opened 2 months ago

romiras commented 2 months ago


Relevant environment info

- OS: Ubuntu Linux 20.04
- Continue: v0.8.25
- IDE: VS Code 1.88.1

Description

It seems that Continue doesn't respect `.gitignore` at all. Every ignored file leaks into the index.

To reproduce

Generate a Ruby on Rails (RoR) application.

.gitignore:

```
# Ignore bundler config.
/.bundle

# Ignore all logfiles and tempfiles.
/log/*
/tmp/*
!/log/.keep
!/tmp/.keep

# Ignore pidfiles, but keep the directory.
/tmp/pids/*
!/tmp/pids/
!/tmp/pids/.keep

# Ignore uploaded files in development.
/storage/*
!/storage/.keep
.byebug_history

# Ignore master key for decrypting credentials and more.
/config/master.key

.env*
```

After a long indexing process, `du -hs ~/.continue/index` reports about 144M. After running `echo 'ignore me' > tmp/ignore.txt`, that file gets indexed as well. Querying the index, e.g. with `sqlite3 -column ~/.continue/index/index.sqlite "select * from tag_catalog where path like '%ignore.txt'"`, shows that ignored files are present:

```
22          /home/user/Projects/r7_app  main        chunks      /home/user/Projects/r7_app/config/master.key  20cc3b0a2108ccc75874833530a741f5b82c89c27a8876cda8933a5981cd8241  1714219436499
1494        /home/user/Projects/r7_app  main        vectordb::  /home/user/Projects/r7_app/config/master.key  20cc3b0a2108ccc75874833530a741f5b82c89c27a8876cda8933a5981cd8241  1714219449950
2900        /home/user/Projects/r7_app  main        sqliteFts   /home/user/Projects/r7_app/config/master.key  20cc3b0a2108ccc75874833530a741f5b82c89c27a8876cda8933a5981cd8241  1714219963517
4371        /home/user/Projects/r7_app  main        codeSnippe  /home/user/Projects/r7_app/config/master.key  20cc3b0a2108ccc75874833530a741f5b82c89c27a8876cda8933a5981cd8241  1714220255652
5820        /home/user/Projects/r7_app  main        chunks      /home/user/Projects/r7_app/tmp/ignore.txt     1475d3ed5223ede0fcb823689c5b83ca154066b72d5a837a1e1b3109ef1d1b6f  1714221543385
5821        /home/user/Projects/r7_app  main        vectordb::  /home/user/Projects/r7_app/tmp/ignore.txt     1475d3ed5223ede0fcb823689c5b83ca154066b72d5a837a1e1b3109ef1d1b6f  1714221543401
5822        /home/user/Projects/r7_app  main        sqliteFts   /home/user/Projects/r7_app/tmp/ignore.txt     1475d3ed5223ede0fcb823689c5b83ca154066b72d5a837a1e1b3109ef1d1b6f  1714221545311
5823        /home/user/Projects/r7_app  main        codeSnippe  /home/user/Projects/r7_app/tmp/ignore.txt     1475d3ed5223ede0fcb823689c5b83ca154066b72d5a837a1e1b3109ef1d1b6f  1714221545550
```
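For anyone who wants to repeat this check without typing the query by hand, here is a minimal Python sketch. It only assumes what the report shows: an SQLite file at `~/.continue/index/index.sqlite` with a `tag_catalog` table that has a `path` column. The function name `find_indexed_paths` is made up for illustration and is not part of Continue.

```python
import sqlite3
from pathlib import Path


def find_indexed_paths(db_path, suffixes):
    """Return the distinct indexed paths that end with any of the given suffixes."""
    conn = sqlite3.connect(str(db_path))
    try:
        found = set()
        for suffix in suffixes:
            # '%' is the LIKE wildcard, so '%' + suffix means "ends with suffix".
            rows = conn.execute(
                "SELECT DISTINCT path FROM tag_catalog WHERE path LIKE ?",
                ("%" + suffix,),
            )
            found.update(row[0] for row in rows)
        return sorted(found)
    finally:
        conn.close()


if __name__ == "__main__":
    # Location taken from the report above; adjust if your index lives elsewhere.
    index_db = Path.home() / ".continue" / "index" / "index.sqlite"
    if index_db.exists():
        suffixes = ["/config/master.key", "/tmp/ignore.txt"]
        for path in find_indexed_paths(index_db, suffixes):
            print("indexed although ignored:", path)
```

Note that `_` and `%` inside a suffix also act as `LIKE` wildcards, so the match is slightly looser than a strict "ends with".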

Log output

No response

sestinj commented 2 months ago

@romiras is this .gitignore in the root of your opened VS Code workspace? I just want to make sure I have all of the structural details of your folders correct so that I can test this myself. My first guess at what's happening is that we aren't correctly handling the leading '/' in .gitignore
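To make the leading-`/` guess concrete: in gitignore syntax, a slash at the start (or in the middle) of a pattern anchors it to the directory that contains the `.gitignore`, while a pattern without one matches at any depth. The Python sketch below is a deliberately simplified matcher written for this thread, not Continue's implementation. It handles comments, `!` negation, anchoring, `*` and `?`, but it does not handle `**`, character classes, escapes, or the directory-only meaning of a trailing `/`. The names `parse_gitignore` and `is_ignored` are hypothetical.

```python
import re


def parse_gitignore(text):
    """Turn .gitignore text into a list of (negated, compiled_regex) rules."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        negated = line.startswith("!")
        if negated:
            line = line[1:]
        # A slash at the start or in the middle anchors the pattern to the
        # directory that holds the .gitignore; otherwise it matches at any depth.
        anchored = "/" in line.rstrip("/")
        body = line.strip("/")
        regex = "".join(
            "[^/]*" if ch == "*" else "[^/]" if ch == "?" else re.escape(ch)
            for ch in body
        )
        prefix = "" if anchored else "(?:.*/)?"
        rules.append((negated, re.compile("^" + prefix + regex + "$")))
    return rules


def is_ignored(rel_path, rules):
    """Check a '/'-separated path relative to the .gitignore's directory."""
    parts = rel_path.split("/")
    for depth in range(1, len(parts) + 1):
        candidate = "/".join(parts[:depth])
        ignored = False
        for negated, regex in rules:
            if regex.match(candidate):
                ignored = not negated  # the last matching rule wins
        if ignored:
            return True  # an ignored parent hides everything beneath it
    return False


if __name__ == "__main__":
    rules = parse_gitignore("/tmp/*\n!/tmp/.keep\n/config/master.key\n.env*\n")
    for path in ["tmp/ignore.txt", "tmp/.keep", "app/tmp/ignore.txt", "sub/.env.local"]:
        print(path, is_ignored(path, rules))
```

With these rules, `tmp/ignore.txt` and `sub/.env.local` come out as ignored, while `tmp/.keep` and `app/tmp/ignore.txt` do not. A matcher that dropped or mishandled the leading `/` would fail to tell the anchored `tmp/` apart from `app/tmp/`, or would not match it at all.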

romiras commented 2 months ago

@sestinj The Ruby on Rails (RoR) application was generated with `rails new r7_app --api -J -T -A --skip-hotwire`, just so that I and others can reproduce the issue. RoR was chosen only as an example. You can generate a skeleton Django app or anything else instead.

Output of the `tree` command in the root of the project:

```
.
├── app
│   ├── channels
│   │   └── application_cable
│   │       ├── channel.rb
│   │       └── connection.rb
│   ├── controllers
│   │   ├── application_controller.rb
│   │   └── concerns
│   ├── jobs
│   │   └── application_job.rb
│   ├── mailers
│   │   └── application_mailer.rb
│   ├── models
│   │   ├── application_record.rb
│   │   └── concerns
│   └── views
│       └── layouts
│           ├── mailer.html.erb
│           └── mailer.text.erb
├── bin
│   ├── bundle
│   ├── rails
│   ├── rake
│   └── setup
├── config
│   ├── application.rb
│   ├── boot.rb
│   ├── cable.yml
│   ├── credentials.yml.enc
│   ├── database.yml
│   ├── environment.rb
│   ├── environments
│   │   ├── development.rb
│   │   ├── production.rb
│   │   └── test.rb
│   ├── initializers
│   │   ├── cors.rb
│   │   ├── filter_parameter_logging.rb
│   │   └── inflections.rb
│   ├── locales
│   │   └── en.yml
│   ├── master.key
│   ├── puma.rb
│   ├── routes.rb
│   └── storage.yml
├── config.ru
├── db
│   ├── development.sqlite3
│   ├── schema.rb
│   ├── seeds.rb
│   └── test.sqlite3
├── Gemfile
├── Gemfile.lock
├── lib
│   └── tasks
├── log
│   └── development.log
├── public
│   └── robots.txt
├── Rakefile
├── README.md
├── storage
├── tmp
│   ├── cache
│   │   └── bootsnap
│   │       ├── compile-cache-iseq
│   │       │   ├── 00
│   │       │   │   ├── 0324cd82370db3
│   │       │   │   └── 2bea6d7542e7b1
│   │       │   ├── 01
│   │       │   │   ├── 87a7d114483147
│   │       │   │   ├── baa583cbd8a735
│   │       │   │   └── d3f77b581741b6
... (with many other temporary directories and files)
│   │       │   ├── fe
│   │       │   │   ├── 2381ed85da335a
│   │       │   │   ├── 5a99b7a28da7d1
│   │       │   │   └── 6acadc23be5f36
│   │       │   └── ff
│   │       │       ├── 32b852982880e8
│   │       │       ├── d8ce61cef93021
│   │       │       ├── dd9911203bbb8a
│   │       │       └── e84f0efbda8321
│   │       └── load-path-cache
│   ├── development_secret.txt
│   ├── ignore.txt
│   ├── pids
│   └── storage
└── vendor

285 directories, 1467 files
```

The `.gitignore` file is also located in the root of the project.

As a result of indexing, everything under `tmp`, the secret files, and everything else we don't expect to be indexed leaks into `~/.continue/index/index.sqlite`.

slyt commented 1 month ago

I think continue.dev is slowing down my machine when I open VSCode because it indexes my 20,000-line `daily_journal.md`, even though a `.gitignore` containing `*.md` exists in the same directory.

dchansen commented 1 month ago

Can confirm. Using WSL or a dev container, Continue will try to index my build folder, even though it is in the .gitignore. This is a particular problem for us because the C++ package manager we are using copies the source of all the dependencies into the build folder, so Continue attempts to index all of Boost.

savely-krasovsky commented 2 weeks ago

In my case Continue respects neither .gitignore nor .continueignore. I am using a JetBrains IDE on Windows.

fazo96 commented 2 weeks ago

Similar issue here, actually worse than what other people have reported. My project is a multi-root workspace with a .gitignore in each root (multiple `folders` entries in the .code-workspace file).

This code-workspace file is opened in a devcontainer.

What happens is that the VSCode Continue.dev extension will start indexing, which takes a very long time. A few seconds later, VSCode will freeze and get stuck on "Reconnecting to devcontainer..." making it impossible to do any work.

A few seconds later the Continue.dev indexing will get stuck, so I can't just wait for it to complete.

I tried adding a .continueignore file at the file system root of the project, but it didn't help.

Disabling the Continue extension will stop the problem.

fazo96 commented 1 week ago

I was able to mitigate the problem somewhat by adding a separate .continueignore to each folder of my VSCode workspace even though 2 of them are subfolders of the root.

However, this still doesn't let me use the extension because it is never able to finish indexing (see #1467) and eventually leads to VSCode or the extension crashing/freezing.
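The per-folder workaround described above could be scripted roughly as follows. This is only a sketch: it assumes that `.continueignore` accepts the same pattern syntax as `.gitignore`, which this thread does not confirm, and the helper name `mirror_gitignore` is hypothetical. It copies each workspace folder's `.gitignore` to a `.continueignore` next to it and never overwrites an existing file.

```python
import shutil
from pathlib import Path


def mirror_gitignore(folders):
    """Copy each folder's .gitignore to .continueignore when the latter is missing."""
    created = []
    for folder in map(Path, folders):
        source = folder / ".gitignore"
        target = folder / ".continueignore"
        # Skip folders without a .gitignore and never overwrite a hand-written file.
        if source.is_file() and not target.exists():
            shutil.copyfile(source, target)
            created.append(target)
    return created
```

Calling it with the list of folders from the `.code-workspace` file returns the paths of the files it created, so it is easy to review or delete them afterwards.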

sestinj commented 1 week ago

Thanks everyone for adding details here. I'm going to work on this problem this week, as it definitely seems pressing. Until then, you can set `"disableIndexing": true` in your config.json to avoid any critical errors.
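If you prefer to apply that setting from a script, a minimal Python sketch could look like the one below. It rests on two assumptions not stated in this thread: that `disableIndexing` is a top-level key, and that your config.json is plain JSON without comments. The function name `disable_indexing` is hypothetical, and the path you pass in (for example a config.json under `~/.continue`) is also an assumption about where your config lives.

```python
import json
from pathlib import Path


def disable_indexing(config_path):
    """Set "disableIndexing": true in an existing JSON config file."""
    path = Path(config_path)
    # Raises FileNotFoundError if the config does not exist, rather than
    # creating a config file that contains nothing but this one key.
    config = json.loads(path.read_text(encoding="utf-8"))
    config["disableIndexing"] = True
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

All other keys are preserved, although the file is re-serialized with two-space indentation, so back it up first if you care about its exact formatting.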

pbrit commented 7 hours ago

Same issue for me as well. My workspace has a few conda environments, and they are all listed in .gitignore. One thing I noticed was that the extension kept the editor busy-looping, so everything was very sluggish.

Adding the aforementioned environments to .continueignore does fix the issue.

My setup: VS Code + Dev Container on Windows PC

sestinj commented 7 hours ago

@pbrit are you on the main release of the extension? I've just recently published a new pre-release version (0.9.177) that should correctly listen to all .gitignore patterns.

The only thing I can potentially think of if you're already on the pre-release: is there any chance you've opened a sub-folder in the repository at the root of your VS Code workspace, where the .gitignore is in the root of the repository, not in the VS Code workspace?
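To illustrate the sub-folder scenario: when the workspace root is a sub-folder of the repository, the relevant `.gitignore` files sit above the workspace, so an indexer has to walk upwards to find them. The Python sketch below shows one way to do that. It is not Continue's code, and it assumes the repository root is the nearest ancestor containing a `.git` entry. The function name `gitignores_up_to_repo_root` is hypothetical.

```python
from pathlib import Path


def gitignores_up_to_repo_root(workspace_dir):
    """List .gitignore files from the repository root down to workspace_dir."""
    start = Path(workspace_dir).resolve()
    found = []
    for directory in [start, *start.parents]:
        candidate = directory / ".gitignore"
        if candidate.is_file():
            found.append(candidate)
        if (directory / ".git").exists():
            # Reached the repository root: return outermost file first.
            return list(reversed(found))
    # No repository root above the workspace: only trust the workspace's own file.
    own = start / ".gitignore"
    return [own] if own.is_file() else []
```

The result is ordered from the repository root inwards. Keep in mind that patterns in each file are relative to that file's own directory, so a path inside the workspace has to be re-expressed relative to each `.gitignore` before matching.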