Auto-add of untracked files screws me up every time

durin42 commented 2 years ago

Description

Initially I thought the auto-add of files was a neat idea, but in practice I just leave untracked files in my repo all the time, and tools like patch(1) assume they can drop .orig or similar files in the WC without it being a problem. I think every time I've used jj I've ended up getting grumpy at auto-adds and having to rip something out of a commit, sometimes after doing a push (when it was effectively emulating hg import for example).

Steps to Reproduce the Problem

patch -p1 < some.diff (or similar)
jj describe && jj close && jj git push
Look at web view of diff, notice you committed a .orig file again (or similar)

Expected Behavior

I (still) don't expect auto-adds, and it really surprises me every time, plus it's super frustrating to have to ignore every file spec I might create as a temporary scratch file or what have you.

Actual Behavior

As above.

Specifications

Platform: macOS
Version: 6c83eb6ae3c56f1912a630dca4751c7f723a81a3

arxanas commented 2 years ago

Agreed on that — I think automatically committing tracked files is fine, but untracked is probably bad:

They could be very big
They could contain secrets
Querying the working copy for only tracked files is probably more efficient in practice (like with git status -uno)

martinvonz commented 2 years ago

I mostly find the feature useful, but I also agree that it can be annoying and confusing. The worst case I've noticed is when you - perhaps accidentally - check out the root commit, where there's no .gitignore file containing target/. Then any command you run will try to commit GBs of data. I ran into that again just a few days ago, and was actually thinking of adding a config for the auto-add behavior.

When looking at an old version of the repo, you'll not see untracked files (e.g. jj --at-op=<some old operation> status), but that seems fine. I think the most annoying bit will be to add a way of providing that information to users who want to commit the working copy in the background (like we will probably do at Google). I'll probably skip that bit to start with when I implement this.

tp-woven commented 1 year ago

Just to throw my 2c in: With the exception of new files that are also added to .gitignore, I actually really like the automatic tracking. I also think the same "repro" from the first comment can be used as a reason to auto-track - if you don't jj st before pushing, you're just as likely to miss a file that you should have added as you are a file that you should have ignored. So my preference is that this would be configurable if possible, rather than just removed.

elasticdog commented 1 year ago

Just wanted to say that I really like the auto-tracking feature as well. After using jj for a while, going back to manually having to think about adding files to be tracked seems like a lot of extra work and possibly more error prone. I'm already used to checking jj status to make sure that I'm committing the expected files...that said, I can definitely see how not thinking about the appropriate ignores right away could be troublesome if the files are large, and also acknowledge the root commit situation (although I don't know how often that would realistically come up in day-to-day usage).

necauqua commented 1 year ago

The thing with accidentally committing GBs of target when you forgot to ignore it or check out some old commit is that there's no GC at the moment, so those GBs are forever in the history.

This is especially likely if something like direnv creates it automatically the moment you check out a commit from before your .gitignore update; that has literally happened to me with the jj repo more than once.

The only way is to recreate the repo (losing the oplog) this time not forgetting to ignore stuff/not editing old commits - which is kind of meh as well.

Full GC of course means basically the same thing, only automated by a single command, but something like jj undo --harder [--at-op OP] [--and-immediate-gc-too] (are you sure [y/N]) to delete the last op/OP unrecoverably and gc things it referenced could be cool.

ony commented 1 year ago

Maybe it is possible to track ignores in jj too. If you switch away from a commit in which something was ignored to one in which it is not, that information can probably be used. E.g. a sort of in-flight ignores that are visible in jj st, to prompt for either extending the explicit ignores or confirming the addition of new paths to the current change.

P.S. Can think of how git checkout reacts when your untracked file is about to be overwritten by checked out files. In case of jj all files are tracked by default and thus even absence of the file is tracked.

necauqua commented 1 year ago

Hm, so auto-tracking when you are doing changes in a working commit is the main killer feature.

But for my issue with it, how about this: What if jj considered .gitignore not just from the current commit, but from the entire history - or actually just from descendants of the current commit? And if you needed to actually add some file in the past you'd have to explicitly track it (instead of explicitly untracking it in the common case where you don't need that).

I know this is kind of magical, but the more I think about it the more it makes sense, idk - and the autorebase is magical on its own, this somehow is kind of even consistent in my head

martinvonz commented 1 year ago

It would be expensive to find the gitignores from all commits, but we could probably index that information. I'm more concerned that it would be unexpected behavior. For example, if you check out a sibling commit where target/ is not ignored, then it would still not be ignored if we only consider descendants.

Maybe it's better to check if the gitignores changed between the old and the new commit and if any untracked files according to the new patterns match the old patterns. If that happens, we could just print a warning about it. We could additionally add the ignores to a per-workspace set of ignores (which we don't support yet). (EDIT: I think this is what @ony suggested.)
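
A rough illustration of that check, as a minimal sketch only: `is_ignored` and `newly_unignored` are hypothetical names, the patterns are passed as flat lists, and the matching is a crude approximation of .gitignore semantics (no negation, no anchoring, no nested ignore files). It says nothing about how jj would actually implement this.

```python
from fnmatch import fnmatchcase


def is_ignored(path, patterns):
    """Approximate .gitignore matching (hypothetical helper).

    A pattern ending in '/' ignores everything under a top-level directory
    of that name; any other pattern is matched against the whole path.
    Real gitignore rules are richer than this.
    """
    for pattern in patterns:
        if pattern.endswith("/"):
            if path.startswith(pattern):
                return True
        elif fnmatchcase(path, pattern):
            return True
    return False


def newly_unignored(untracked_paths, old_patterns, new_patterns):
    """Paths that the old commit's patterns ignored but the new commit's do not."""
    return [
        path
        for path in untracked_paths
        if is_ignored(path, old_patterns) and not is_ignored(path, new_patterns)
    ]


# Hypothetical example: moving to a commit whose .gitignore lacks 'target/'.
old = ["target/", "*.orig"]
new = ["*.orig"]
files = ["target/debug/app", "src/main.rs", "notes.orig"]
for path in newly_unignored(files, old, new):
    print(f"warning: {path} was ignored in the previous commit but is not ignored here")
```

The result of `newly_unignored` is also what a per-workspace set of ignores could be seeded from, if such a thing existed.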

necauqua commented 1 year ago

If that happens, we could just print a warning about it

The warning being "those differences are implicitly untracked for this WC, in case your direnv caused GBs of files to generate in target and/or .direnv - add them to gitignore here or explicitly track them, moving to another commit without them ignored (e.g. jj new) will cause them to be tracked in that commit"

^ this is a loose idea, could be refined, for example jj st will have that information ofc

ilyagr commented 1 year ago

Following @ony's suggestion, perhaps when the working copy moves to a new commit, we could track "newly unignored" files by comparing the old .gitignore and the new .gitignore. For example, we could store an extra tree of "newly unignored" files.

Then, the UI could provide ways for dealing with these files. E.g. there could be a command like jj ignored --previously that lists them and jj ignored --previously --restore that gets rid of them. We'd also have to decide whether a modification to a "previously ignored" file makes it no longer "previously ignored".

kevincliao commented 8 months ago

I ran into this today when trying to checkout to a different branch that doesn't contain node_modules. As people have mentioned above at that point it's not possible to run any jj commands. I ended up creating a temporary .gitignore file before being able to jj op restore to a previous checkpoint. Is there a better way to recover when running into this? I wonder if it's possible to have jj op commands still work in this scenario.

kevincliao commented 8 months ago

Ahh ignore my comment, I think I got confused - once there is an option to not auto-add untracked files jj op commands will work again.

yuja commented 8 months ago

Is there a better way to recover when running into this? I wonder if it's possible to have jj op commands still work in this scenario.

You can pass --ignore-working-copy to these commands, but we don't have the last bit to reset the working copy without snapshotting yet.

jj op log --ignore-working-copy  # "jj op log" also works with the current main branch
jj op restore --ignore-working-copy @-
jj workspace update-stale --some-option-to-not-snapshot-before-resetting

HadrienG2 commented 8 months ago

Overall, it seems there is no good automated way to handle untracked files when creating commits. On one hand, not tracking them leads to incomplete commits. On the other hand, auto-tracking them leads to committing unwanted files. So how would you feel about some variation of the following semi-automated design?

By default, interactively prompt before auto-adding files, something like This command will add <list of files> to the current commit, proceed? [y/N].
- Saying yes follows the current behavior.
- Saying no aborts the command with an error return value and lets you use jj track and .gitignore as appropriate.
Have a way to whitelist sets of files (e.g. source files) so that they are auto-added without a prompt, and not mentioned in prompts when they do occur.
- This could take the form of a .jjadd file that uses the same glob syntax as .gitignore.
- The aforementioned prompt would mention the possibility of configuring jj for auto-adding and ignoring.

I think this might strike a good balance between the following concerns:

Files which we want to track (like source files) eventually get auto-added silently as desired, avoiding incomplete commits and replicating the good parts of the current jj auto-add UX.
Files which we do not want to track (like object files, target/ directories...) eventually get ignored silently as desired, without undesirable creation of commits that will keep them in the history forever.
After a short initial configuration period, seeing the prompt becomes an exceptional event and thus leads the user to pause and think, as desired in this situation.

For scripted operation, there should be a way to provide a default answer to the prompt via CLI arguments.
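
To make the proposal concrete, here is a minimal sketch of the decision logic. Everything in it is an assumption for illustration: the names `classify`, `snapshot` and `Aborted`, the `answer` parameter standing in for a CLI-provided default, and the use of Python's `fnmatch` instead of real .gitignore glob semantics.

```python
from fnmatch import fnmatchcase


class Aborted(Exception):
    """Raised when the user (or the scripted default) says no."""


def matches_any(path, patterns):
    return any(fnmatchcase(path, pattern) for pattern in patterns)


def classify(new_files, add_patterns, ignore_patterns):
    """Split new files into (auto_add, ignored, needs_prompt)."""
    auto_add, ignored, needs_prompt = [], [], []
    for path in new_files:
        if matches_any(path, ignore_patterns):
            ignored.append(path)        # e.g. listed in .gitignore
        elif matches_any(path, add_patterns):
            auto_add.append(path)       # e.g. listed in a hypothetical .jjadd
        else:
            needs_prompt.append(path)   # neither: ask the user
    return auto_add, ignored, needs_prompt


def snapshot(new_files, add_patterns, ignore_patterns, answer=None):
    """Return the files to add, or raise Aborted if the answer is no.

    `answer` plays the role of a default supplied for scripted operation;
    when it is None the user is asked interactively.
    """
    auto_add, _ignored, needs_prompt = classify(new_files, add_patterns, ignore_patterns)
    if not needs_prompt:
        return auto_add                 # nothing surprising, no prompt
    if answer is None:
        listing = ", ".join(needs_prompt)
        reply = input(f"This command will add {listing} to the current commit, proceed? [y/N] ")
        answer = reply.strip().lower() in ("y", "yes")
    if not answer:
        raise Aborted("track or ignore the listed files, then retry")
    return auto_add + needs_prompt
```

With `add_patterns=["*.rs"]` and `ignore_patterns=["target/*"]`, a new `src/a.rs` is added silently, `target/x.o` is skipped silently, and only something like `trace.log` triggers the prompt.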

martinvonz commented 8 months ago

I'm personally quite happy with the current behavior (except for the behavior when updating to a commit with different .gitignore). It can be a bit annoying in the beginning, but once you've added the appropriate paths, I find that it works pretty well. Maybe others feel differently. But even if they don't, we may want to make it less annoying for new users by doing something like you suggest.

ilyagr commented 8 months ago

I'm not very happy with the idea of the interactive prompt.

I think that if you edit a .gitignore, any subsequent jj command could trigger this prompt, including jj log. I tend to run an analogue of watch jj log in a tmux pane permanently, and I think this would work very badly with the interactive prompt. Firstly, I'll need to adjust the command to use the "scripted mode". In the "scripted" mode, if the default answer to the prompt is "yes", this goes back to users experiencing auto-add of untracked files. If it's "no", jj's view of the workspace could be out of sync with reality for a while (but, if we go with a prompt, I think this is the better option).

Other UIs will also do an analogue of jj log regularly. Every jj UI (e.g. VS Code plugin) would probably need to have a way of giving this prompt to the user, if we made this interactive.

HadrienG2 commented 8 months ago

Ah, yes, there's that. I knew that this design decision of having status commands modify the repository was fishy and going to cause problems someday...

lf- commented 5 months ago

This feature has sadly made me bounce off of jj immediately every time I try it, which is really unfortunate because I keep hearing such good things about it, and want to give it a genuine try.

Every single repository I work on regularly has various testing/strace-log/whatever files in its root. I actually don't mind auto tracking in src/, but in the root it just is not compatible with my workflow.

fwiw a workaround for this that I've not yet checked works on jj might be some kind of terrible thing like so in the .git/info/exclude:

/*
!src/
!Cargo.*

This workaround is quite bad indeed, and I would rather not have to reimplement the git index in .git/info/exclude to be able to use jj, though admittedly it would be with wildcards at least.
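
For readers unfamiliar with negated patterns, the intended effect of those three lines can be sketched as below. This is a deliberately simplified model that only looks at top-level entries and applies the gitignore rule that the last matching pattern wins; `top_level_ignored` is a hypothetical helper, and whether jj honors the file exactly this way is, as noted above, unchecked.

```python
from fnmatch import fnmatchcase


def top_level_ignored(name, is_dir, patterns):
    """Decide whether a top-level entry is ignored (simplified model).

    Patterns are applied in order and the last one that matches decides.
    A leading '!' re-includes; a trailing '/' restricts the pattern to
    directories. Only top-level names are handled here.
    """
    ignored = False
    for raw in patterns:
        negated = raw.startswith("!")
        pattern = raw[1:] if negated else raw
        dir_only = pattern.endswith("/")
        pattern = pattern.strip("/")
        if dir_only and not is_dir:
            continue
        if fnmatchcase(name, pattern):
            ignored = not negated
    return ignored


exclude = ["/*", "!src/", "!Cargo.*"]
entries = [("src", True), ("Cargo.toml", False), ("strace.log", False), ("target", True)]
for name, is_dir in entries:
    state = "ignored" if top_level_ignored(name, is_dir, exclude) else "kept"
    print(f"{name}: {state}")
```

Files inside src/ are not covered by this model; in Git they stay visible because `/*` is anchored to the repository root and so only matches top-level entries.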

ilyagr commented 5 months ago

I have no idea whether this would be helpful to you, but here's something that helped me a lot. I can't remember who had suggested it originally; it might be in the FAQ.

I added _ignore/* to ~/.config/git/ignore (~/.gitignore should also work).

This is possibly not absolutely optimal (I have been wondering whether /_ignore/ would be better), but works well enough. I actually use _ilyaignore to make the name more unique.
Create an _ignore subdir in my repo
Save all weird logs and traces to it

lf- commented 5 months ago

Yup it is in the FAQ or something; I've seen it given as advice before. I just don't like it and it doesn't vibe with how I work, since it would be a whole bunch of extra typing. I could have it be i/ or something, I guess, to reduce typing, but I would still have to remember to do it every time, which feels kind of bad?

ilyagr commented 5 months ago

Inspired by more feedback (https://github.com/martinvonz/jj/discussions/3528#discussioncomment-9148691), perhaps @dpc 's suggestion from that post might work. Perhaps we could have a notion of "untracked" files, like Git, and default files to "untracked", while also auto-updating all the tracked files on each command?

jj status would certainly show any untracked files. jj log could too. It's not quite in the spirit of "everything is a commit" (untracked files would show up as a fake commit in some places), but might work.

One question is what jj diff would do. My first instinct would be to have it act on tracked files only, but complain loudly when there are untracked files. Perhaps each command would do that, I'm unsure.

This would be a huge change, so I almost certainly missed some important considerations.

dpc commented 5 months ago

This would be a huge change

Speaking out of ignorance, I'm guessing jj already needs to compare all worktree files against .gitignore. Right after that it could just compare them against files already tracked in the current change and ignore ones that are not. Plus a command to track a file. And that's kind of it, no? Showing untracked files, etc. seems like a nice-to-have. Deleting a tracked file could work as an "untrack", just like it already does.

Hmm... I guess mv <sometrackedfile> <newlocation> now requires explicit calling "track" on a new location, which is a bit breaking the "immersion", but I think it's fine. And again - this behavior would be optional (but I would suggest making it the default for the sake of newcomers). People that figured out everything could just opt-in into current seamless behavior, which I find elegant and I'm sure I would eventually settle into it just fine, after making sure given repo doesn't produce untracked trash, adding some ./tmp/ to .gitignore and remembering to create my debugging stuff inside it.
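
A minimal sketch of that "explicit" mode, under loud assumptions: `snapshot` and `track` are hypothetical names, the tracked set is a plain Python set, and `is_ignored` is whatever .gitignore check already exists. It is only meant to show how little logic the idea needs, not how jj's snapshotting works.

```python
def snapshot(worktree_files, tracked, is_ignored):
    """Return (still_tracked, untracked) for an explicit-tracking mode.

    Tracked files that still exist stay tracked; tracked files that were
    deleted drop out (deletion acts as "untrack"). Files that are neither
    tracked nor ignored are reported as untracked and otherwise left alone.
    """
    present = set(worktree_files)
    still_tracked = tracked & present
    untracked = {p for p in present - tracked if not is_ignored(p)}
    return still_tracked, untracked


def track(tracked, path):
    """Hypothetical explicit 'track' command: start tracking one path."""
    return tracked | {path}


def ignored(path):
    # Stand-in for the real .gitignore check.
    return path.startswith("target/")


tracked = {"src/main.rs", "README.md"}

# A scratch file and a build output appear: nothing new gets tracked.
tracked, untracked = snapshot(
    ["src/main.rs", "README.md", "debug.log", "target/app"], tracked, ignored
)
print(sorted(tracked), sorted(untracked))

# A raw `mv README.md docs.md` looks like a deletion plus a new untracked file.
tracked, untracked = snapshot(["src/main.rs", "docs.md"], tracked, ignored)
print(sorted(tracked), sorted(untracked))
tracked = track(tracked, "docs.md")
```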

martinvonz commented 5 months ago

jj status would certainly show any untracked files. jj log could too. It's not quite in the spirit of "everything is a commit" (untracked files would show up as a fake commit in some places), but might work.

One question is what jj diff would do. My first instinct would be to have it act on tracked files only, but complain loudly when there are untracked files. Perhaps each command would do that, I'm unsure.

If we add support for untracked files, I think it should be pretty much only jj status that shows them. They would just be invisible to every other command. Would that work for the untracked-files proponents?

I guess mv <sometrackedfile> <newlocation> now requires explicit calling "track" on a new location

I don't think so. Almost all commands, and probably also the future jj mv work on commits and just update the working copy to match afterwards.

dpc commented 5 months ago

Would that work for the untracked-files proponents?

:+1: The whole point of not tracking them is so that they don't influence anything.

Almost all commands, and probably also the future jj mv work on commits and just update the working copy to match afterwards.

Correct me if I'm wrong, but currently mv s d would automatically have changes to s and d reflected, possibly even as a rename operation of some kind. I wasn't even aware of jj move which actually seems to exist. Yes jj move would automatically track the destination path in both "explicit" and "seamless" (current) mode. Only "raw" mv while in "explicit" mode would appear to jj as source file being deleted (and thus untracked), and destination file showing up as a new (untracked) file. Which I think is perfectly fine in practice.

martinvonz commented 5 months ago

Correct me if I'm wrong, but currently mv s d would automatically have changes to s and d reflected, possibly even as a rename operation of some kind. I wasn't even aware of jj move which actually seems to exist.

Ah, I just thought you meant a hypothetical jj mv when you said mv. Never mind then :)

durin42 commented 5 months ago

If we add support for untracked files, I think it should be pretty much only jj status that shows them. They would just be invisible to every other command. Would that work for the untracked-files proponents?

This actually wouldn't really help me, because my most common use case is for a config file that I want to have sit in the working copy (or a bisection script for git bisect run or similar, I guess) and not have it disappear when the working copy moves around (but also never be added to history.) And typically I don't control the .gitignore because it's someone else's project. :)

lf- commented 5 months ago

I think your use case would work in the git-like untracked files proposal? they're just left alone by every command as if ignored, which would match git behaviour where this does work.

(but if you do need to actually ignore them, .git/info/exclude)

necauqua commented 5 months ago

Inspired by more feedback [..]

I'm fairly certain I talked about having tracked files like this at some point somewhere, eh But yep I'm all for it, I'm still carefully looking at things like vscode git index thing (or actually my old gst -> git status alias) before running jj commands, or doing --ignore-working-copy ¯\_(ツ)_/¯ Perhaps a track-all config?

jj move which actually seems to exist

fyi this one moves files between commits, not like mv :)

martinvonz commented 5 months ago

I don't think it will be a priority for me to work on this any time soon, but I won't object if anyone else sends a PR for this. I haven't thought of any problems with it.

maan2003 commented 5 months ago

maybe it should depend on file extension, I wouldn't mind auto tracking .rs or .toml or *.sh files

martinvonz commented 5 months ago

maybe it should depend on file extension, I wouldn't mind auto tracking .rs or .toml or *.sh files

I've also been thinking that we could have some way of configuring that. Either a regular config or a file that can be tracked in the repo, like a .gitignore file but instead .jjtrack or something. I'm not sure how much the patterns are going to be about personal preference or vary from project to project.

dpc commented 5 months ago

Heuristics like this are complications in their own way. I would first implement manual tracking and then worry if something "in the middle" is even needed.

PhilipMetzger commented 5 months ago

Inspired by more feedback [..]

I'm fairly certain I talked about having tracked files like this at some point somewhere, eh But yep I'm all for it, I'm still carefully looking at things like vscode git index thing (or actually my old gst -> git status alias)

Yeah, I remember that, I called it the second hell of Git[^1]. I don't mind proceeding with this solution as long as we advertise it as transitional with no guarantees, so as not to set user expectations too high (see Hyrum's Law) and to lower the maintenance effort.

And if that is not enough and we require more transitional features to behave like a Git client, I'd put them under a single config flag with the same guarantees. Not that I have a particular interest in that.

[^1]: See the following messages for my unchanged opinion on that.

marc-h38 commented 2 months ago

Perhaps we could have a notion of "untracked" files, like Git, and default files to "untracked", while also auto-updating all the tracked files on each command? [...] This would be a huge change, so I almost certainly missed some important considerations.

I don't think it will be a priority for me to work on this any time soon, but I won't object if anyone else sends a PR for this. I haven't thought of any problems with it.

Speaking out of ignorance, I'm guessing jj already needs to compare all worktree files against .gitignore.

How about a simpler, new echo '*' >> .git/info/jjexclude file in the shorter term? Would that be a smaller change?

Does jj read the various gitignore files itself, or does it blindly rely on some libgit for that? In the latter case, could libgit be told to consider .git/info/jjexclude when and only when used by jj?

I just spent a few hours learning about jj and I was blown away. So smart! So I was on the verge of trying it. Then I noticed this auto-add and that was quite a setback :-(

I'm so glad I found this issue! For a short time I wondered whether I was the only weirdo in the universe (ok: in the jj universe....) having temporary files in his git checkouts... The current "scratchpad" entries in the FAQ reinforced that impression. Now I feel less weird. Maybe the FAQ should link to this issue? BTW "scratchpad" sounds like a pretty narrow view of temporary files. I didn't find the _ignored/ trick in the FAQ.

I understand that we all use different tools, solve different problems and have different ways to work. Apparently, most jj users are very lucky to have a small and controlled set of temporary files and for them, never forgetting to commit a new file must be great! But that's definitely NOT everyone: I very, very rarely create new files and I have tons of temporary files all the time. So for me the extremely minor convenience of not forgetting to track one new file is completely dwarfed by the HUGE inconvenience of auto adding temporary files. So I'd love auto adding to be somehow git configurable in the longer term. With some .git/info/jjexclude in the shorter term?

Besides being the local git guru, I debug build systems a lot and I very regularly juggle with a lot of large build directories. Sure, I could make the effort of calling them all "build_A/", "build_B/", but that's extra typing and cognitive load not to forget to follow that convention. I much prefer to dedicate my typing energy to "mv builddir some_useful_name", "mv builddir some_other_useful_name", etc.

I also have a lot of various (build) logs and other traces on a very regular basis. Some _ignored/ naming convention would be even more painful here.

Finally, there's the (git) bisect problem! Very frequently, I want to compare the output/traces of different commits. How could I do that if jj automatically removes my temporary files every time I switch to a different commit?

The various performance issues are pretty obvious and have also been mentioned above already. jj's eagerness to track all temporary files is actually surprising considering its focus on performance. Also: secrets. Also: someone decides to rename "buildDir/" to "build_dir/" (and adjusts .gitignore accordingly). What happens when you move back and forth across that commit? Based on the previous comments, this seems just intractable.

From what I understood, the best workaround currently is something like echo '*' >> .git/info/exclude but then I lose the ability to display temporary files with a simple git status. Then manually comment out that line in the rare cases when I need to add a new file? Or just use bare git to add it?

marc-h38 commented 2 months ago

Apparently, most jj users are very lucky to have a small and controlled set of temporary files and for them, never forgetting to commit a new file must be great!

In my couple of decades of experience, forgetting to add a new file seemed extremely rare and more importantly it only slowed down the submitter, never the rest of the team. I think it's a combination of new files being rare in the first place and obvious build failures catching forgotten new files so quickly that most of the team doesn't even have time to notice the issue.

On the other hand, I see people submitting spurious changes at least once a month. Maybe not at Google, but there is a significant number of people who don't self-review their own changes before submitting them or submit changes so large that they can't thoroughly review them. git commit -a and 5pm push!

From a team and CI perspective, a forgotten new file is either caught by CI very quickly, or part of something brand new and not tested yet that no one really cares about yet. On the other hand, spurious changes are much harder to catch because they usually "pass": forgotten log statements, unrelated feature, forgotten comments, subtle concurrency experiment, line commented out for "temporary" testing purposes...

I guess all I'm saying is: jj's new, fascinating "auto-add" approach seems simply incompatible with sloppy code reviews.

dpc commented 2 months ago

So much what @marc-h38 said that I'm going to write a needless comment to underline it.

kankri commented 2 months ago

I was really interested to try out Jujutsu a couple of years ago when I learned about its fresh ideas, but when I learned about the auto-add feature I was first perplexed (surely this can't mean that huge binary files and all my secrets get consumed by jj!?) and then disappointed. I have kept coming back to jj over the years, this time after seeing the cool gg GUI, only to find out this sticking point for me still exists.

I think I have forgotten to add a file only a handful of times during my career, and every time it was noticed before the commit was merged, most often before I even pushed my changes from my local repository. Clearly this can't be the driving motivation for the auto-add feature. But what is it then?

To me, forgetting to commit a file is harmless: it is usually quickly noticed and easily recovered. Accidentally committing a file can be deeply harmful: locally it can cause performance issues (big binary files or deep test file directories etc.), but accidentally slipping a file with credentials in a commit can be difficult to notice and cost a lot of effort to remove afterwards (invalidating leaked credentials and verifying they were not used maliciously, creating new ones).

It also seems to me that controlling the limited set of files contained in a project is easier than trying to prevent the auto-add feature from adding unwanted files from the unlimited set of files.

I use ignore files in the project to hide known untracked files the build flow generates (e.g. *.orig or *.pyc) and personal ignore files to hide untracked files generated by the tools I personally use. However, I intentionally don't ignore many of the untracked files I create in a project directory (notes, credentials, ...). Some files are temporary, so I don't bother ignoring them. Some files are long lasting. Seeing them as untracked and unignored highlights them and gives me a starting point when I come back to a project after a while. So at least for me the goal is not to have the status output completely empty by having files either tracked by the VCS or ignored by manually adding ignore rules. If jj just didn't auto-add my files, I would be happy!

So basically I'm completely agreeing with @marc-h38 but just wasted your time by rephrasing it from a slightly different POV.

emilazy commented 2 months ago

Clearly this can't be the driving motivation for the auto-add feature. But what is it then?

Without expressing a definitive opinion one way or the other: the advantage of the current behaviour is that the @ commit is the working copy, minus explicitly excluded files. That means that, e.g. jj status telling you that new files exist works because it looks at the @ commit. It makes the model a lot simpler and more orthogonal. It also gets you nice functionality like new files getting automatically stashed when you move to another commit. Adding a manual file tracking step starts to reintroduce the idea of a staging area that Jujutsu so wonderfully subsumes into its much more coherent model.

I understand why it’s rough for people, though. To quote myself on the Discord:

auto-add is definitely the aspect of Jujutsu I struggle with the most – it's clearly pretty crucial to the whole idea of working-copy-as-commit, stashing "for free", etc., but at the same time I do absolutely have to spend more time checking that stuff didn't sneak in to commits, manually .git/info/exclude-ignoring stuff I need for my development environments in third party projects (.envrc etc.), and so on

you're using rust-analyzer and rustc ICE'd without you noticing? you might end up pushing a rustc-ice-*.txt file…

FWIW though, I really wanted that behaviour to not exist and really wanted to push for some kind of opt-in manual-add system when I first started playing around with Jujutsu many months ago, and I'm considerably more sanguine about it now, so perhaps it largely is just an adjustment period

I think setting up more global ignores might make me happier. I'm trying out adding /.envrc, /.direnv, /flake.nix currently – it'll be annoying when I actually want to commit those files and I'll probably forget to re-include them in .git/info/exclude at some point, but at least I can set up my development environment in third-party projects without fuss

ideally I would try and make as much stuff in my workflow not put files in-tree as possible

although I've resisted having wrapper directories for all my repositories for years

[…]

for build tooling it sucks [to have things write randomly to the working copy directory]. for stuff like per-project editor or development environment configuration, I'm not so sure

stuff that's not a generated output, that's clearly coupled to the project, but that is not necessarily tracked upstream (but still can be)

I wouldn’t be totally opposed to an experimental opt‐in manual tracking mode, but I worry about a long‐term bifurcation of the ecosystem. I realize this is an inherently biased sample, but I wonder how many people there are that have intensively used Jujutsu for months and still really hate the auto‐add behaviour, given my own background in feeling that way at first and gradually warming to merely having mixed feelings about it.

marc-h38 commented 2 months ago

Without expressing a definitive opinion one way or the other: the advantage of the current behaviour is that the @ commit is the working copy, minus explicitly excluded files.

Sure, but there's a HUGE difference between auto-adding changes to already tracked files versus auto-adding new and temporary files. Can we "meet" somewhere in that middle? Thanks to some new git config, some new jjexclude, other,...

I don't know anything about jj's implementation but it seems to already have some sort of .gitignore capability and concept of "untracked" files. Hopefully that can just be leveraged/extended somehow? I mean in some jj-specific fashion; not just with some gitignore. BTW I'm curious what happens in case of a "non-merge conflict" with a file in .gitignore but tracked in a different commit...

It also gets you nice functionality like new files getting automatically stashed when you move to another commit.

That's exactly what you do not want with temporary files, log files, build directories, etc. For me these temporary files are many orders of magnitude more frequent than new tracked files.

Adding a manual file tracking step starts to reintroduce the idea of a staging area that Jujutsu so wonderfully subsumes into its much more coherent model.

I really don't see how this would re-introduce the idea of a staging area. jj add would just add the new file to the current commit, done. It becomes tracked and that's the end of the story.

steveklabnik commented 2 months ago

I still love auto-add, but I recently had a conversation with someone where it was a showstopper for them. Three cases:

1. they like to keep personal notes in the repo, but not committed, and find the need to remember to add exclusions annoying
2. some projects leave output files all over the tree, and if they're large projects, it can be difficult to even get a .gitignore right
3. debug logs or core dumps often end up in random places

I personally think that number 2 is something that should, like, be fixed, but the reality is that if an upstream project doesn't care about being clean here, it's not gonna be a good experience to use jj on, regardless of whether it's that project's "fault" or not.

scott2000 commented 2 months ago

If the default is changed to untracked, the main concern I have is that I might not notice that there are untracked files. So if there are any untracked files, I think it would be good to print a warning before every command recommending that the user either run jj track or add it to the .gitignore file. I use jj log way more often than jj st, so I feel like I might miss untracked files if it doesn't cause a warning to be printed. For instance, a session might look like:

$ jj diff

$ echo "AAA" > a.txt

$ jj diff
Warning: You have untracked files in your working copy. If you want to ignore
these files, add them to a .gitignore file. Otherwise, run `jj track` to start
tracking changes to these files.

$ jj track
Started tracking 1 new file.

$ jj diff
Added regular file a.txt:
        1: AAA

Perhaps an issue with this design is that if you add a file and then run a command like jj commit, you might see the warning too late (after you have already finished committing). I do like @HadrienG2's idea of being able to choose a set of files to automatically track (though maybe as a config option rather than a file). So you could, for instance, specify glob:"**/*.rs" as auto-tracked, so that you usually wouldn't have to manually track files while writing Rust code. If someone wanted the current behavior, they could also just specify the root directory to auto-track all files.
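A minimal sketch of the auto-track idea above, written in Python purely for illustration. The function name `classify_new_files` and the pattern list are hypothetical, and the matching uses Python's `fnmatch` rules (where `*` also matches `/`), not jj's own glob syntax, so the pattern here is written as `*.rs`.

```python
import fnmatch


def classify_new_files(new_files, auto_track_patterns):
    """Split new working-copy files into auto-tracked ones and ones needing a decision.

    Hypothetical helper: `auto_track_patterns` stands in for a user-level
    config option; it is not an existing jj setting.
    """
    auto_tracked = []
    needs_decision = []
    for path in new_files:
        # A file is auto-tracked as soon as any configured pattern matches it.
        if any(fnmatch.fnmatchcase(path, pat) for pat in auto_track_patterns):
            auto_tracked.append(path)
        else:
            # Everything else stays untracked and would trigger the warning.
            needs_decision.append(path)
    return auto_tracked, needs_decision
```

Under these assumptions, passing `["*"]` as the pattern list would reproduce the current behavior of tracking every new file.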

That being said, the current behavior wasn't very hard to get used to, and I do usually like not having to think about whether files are tracked or not. It's only really bothered me a few times, and usually it's at the start of working on a project when I haven't set up the .gitignore properly yet. But the risk of accidentally committing a large directory with many small files like node_modules definitely scares me. Perhaps it would be sufficient to make the file size limit also apply to the total size of untracked directories?
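One way to picture the directory-size idea is the following Python sketch. It is only an illustration under assumptions: `oversized_new_dirs` and `max_total_bytes` are hypothetical names, and nothing here reflects how jj actually applies its file size limit.

```python
import os


def oversized_new_dirs(root, new_dirs, max_total_bytes):
    """Return {directory: total_bytes} for new directories exceeding the limit.

    Hypothetical check: sums the sizes of all files under each new directory,
    so many small files (as in node_modules) still add up to a large total.
    """
    flagged = {}
    for rel_dir in new_dirs:
        total = 0
        for dirpath, _dirnames, filenames in os.walk(os.path.join(root, rel_dir)):
            for name in filenames:
                # lstat avoids following symlinks out of the directory.
                total += os.lstat(os.path.join(dirpath, name)).st_size
        if total > max_total_bytes:
            flagged[rel_dir] = total
    return flagged
```

A tool could refuse to snapshot any flagged directory until the user either ignores it or confirms that it should be tracked.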

gulbanana commented 2 months ago

When using Git, I thought of the ability to manage the index manually as a waste of time. I was used to git add -A and IDEs which stage everything by default, so I already picked up the habit of .gitignoring everything that shouldn’t be tracked.

From that point of view jj’s simpler system felt like a straight upgrade. Of course, it would be nice to support more workflows if it doesn’t add back some kind of inescapable complexity.

marc-h38 commented 2 months ago

If the default is changed to untracked,

I have not seen anyone requesting the default to be changed. Please don't spread that sort of "fear"?

I think it would be good to print a warning before every command recommending that the user either run jj track or add it to the .gitignore file.

That would be overkill if the default is not changed. If you do ask for explicit tracking of new files then showing untracked files in jj status would be plenty enough. This is what git and every other version control system has done since forever.

I use jj log way more often than jj st, so I feel like I might miss untracked files if it doesn't cause a warning to be printed

Or, you should just keep using the default :-)

Besides my first surprise at how rarely other people seem to use temporary files, I'm also very surprised by this fear of forgetting new files. 1. How often do people create new files? 2. What are the consequences that are so scary? We've described in great detail the many concrete problems that happen when auto-adding temporary files. I haven't seen what serious problem forgetting to add a new file causes, beyond mere inconvenience.

it would be nice to support more workflows if it doesn’t add back some kind of inescapable complexity.

I hope this would be as "simple" as automatically treating all new and temporary files like they are part of .gitignore. Except in jj status, where they would be shown the way git status shows them, but that's hopefully just a small display difference. So the logic would not change too much? Some error messages should be different maybe.
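The model described above can be sketched as a toy in Python. This is an assumption-laden illustration, not jj's implementation: `snapshot_and_status` and `track` are hypothetical names, the working copy is a plain dict, and ignores are exact paths rather than patterns.

```python
def snapshot_and_status(working_copy, tracked, ignored=frozenset()):
    """Return (snapshot, untracked) for a toy working copy.

    working_copy: dict mapping path -> content
    tracked:      set of paths that are snapshotted automatically
    ignored:      set of paths hidden from both the snapshot and the status
    """
    # Only tracked files end up in the commit; new files are left alone.
    snapshot = {path: content for path, content in working_copy.items() if path in tracked}
    # Untracked files are merely reported, as `git status` does.
    untracked = sorted(
        path for path in working_copy if path not in tracked and path not in ignored
    )
    return snapshot, untracked


def track(tracked, path):
    """Explicitly start tracking a file; it is snapshotted from then on."""
    return tracked | {path}
```

In this toy model there is no staging area: a file is either tracked, and then always snapshotted, or it is not.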

emilazy commented 2 months ago

For what it’s worth, none of my concerns about this matter are related to fears of forgetting to add files. I recently learned that tree snapshotting does actually already depend on the contents of the previous tree, though, which alleviates my concern about having to either add additional state dependencies to the model or reintroduce a concept similar to the staging area. (But I have to admit I do kind of wish that status quo wasn’t the case.)

Anyway, as I said I’m not opposed to experimenting with this. But I would encourage people who see it as a deal‐breaker to try Jujutsu as it is for a while, because it does seem like people generally adjust to it more than they’d expect. I just generally fear that if we change the model every time people can’t do the exact same things they were doing with Git, Jujutsu will end up with a UX almost as complex and confusing as Git’s, and tie itself down to just being a nicer Git client.

joyously commented 2 months ago

Reading through this, it occurs to me that some people treat their repository as a development playground and some treat it as a vault. It seems like jj is designed to be more on the vault end of the range, but not quite all the way. I think that if you have tool outputs and saved logs and notes and temporary files, don't mix that with your repository.

marc-h38 commented 2 months ago

But I would encourage people who see it as a deal‐breaker to try Jujutsu as it is for a while, because it does seem like people generally adjust to it more than they’d expect.

OK but "adjust" how precisely? What are the clear, documented recommendations for dealing with temporary files? Is it this one? https://martinvonz.github.io/jj/prerelease/FAQ/#how-can-i-keep-my-scratch-files-in-the-repository

The solution currently described in that FAQ entry has a number of technical issues that have already been described at great length above (so: not repeating). It also does not use the term "temporary files" but the unusual and narrower "scratchpad" term instead. It does not mention .git/info/exclude which was mentioned above and seems better in a few ways. It does not mention the ignored/ subdirectory option which seems to provide a less constraining naming convention...

My impression from this discussion is that everyone who easily "adjusted" barely uses temporary files; just one build directory + a couple file extensions maybe and that's all. Please prove me wrong and let's fix the documentation? I'm never shy to submit a documentation update myself but in this case I still don't know what are the best known methods to deal with temporary files. And sorry but I'm not interested in spending time trying jj if none of these methods proves to be realistic in the end; catch-22!

I just generally fear that if we change the model every time people can’t do the exact same things they were doing with Git, Jujutsu will end up with a UX almost as complex and confusing as Git’s, and tie itself down to just being a nicer Git client.

Temporary files have been "supported" (I never thought I would write this) by every version control system since forever. There is nothing specific to git here (and nothing related to the git index).

Has the jujutsu model already been changed to get closer to git or some other system and has that degraded the experience?

it occurs to me that some people treat their repository as a development playground

Is it so surprising for version control to be related to "development"?

I think that if you have tool outputs and saved logs and notes and temporary files, don't mix that with your repository.

So yes: one of the (too many?) solutions is to never forget to prefix everything with ../. Pretty inconvenient IMHO and likely incompatible with many tools (write access to .. is not guaranteed) but maybe one of the non-mutually exclusive alternatives worth mentioning in the documentation? That one would use very little documentation space.

DianaNites commented 2 months ago

Many development tools and build systems and even text editors dirty the directories they work in. The KDE Kate text editor, for example, when editing any file creates a temporary .name.kate-swp file. vim by default creates various temporary and swap files, too.

Visual Studio (not code) solution files have all kinds of files and nonsense, most of which you don't want in a repo.

Visual Studio Code has .vscode/, Rust has .cargo/ and target/, Node has node_modules, and patch leaves files behind as in the original issue.

The norm for many tools is and has been that it is okay to "dirty" their directories, and that their use in source control repositories is safe.

Many developers have come to expect this as well: being able to create untracked logs, scratch files or directories, and copies of files to compare against, without having to keep two checkouts or change one back and forth.

In the presence of file tracking, this is safe; it's only annoying when git status shows untracked files you forgot to ignore or delete, or when you forget to git add a file after all.

Whereas without file tracking, like with jj, the expectation is that it's unsafe for tools and developers to dirty things even temporarily, because it's dangerous to forget to delete/ignore/exclude a potentially sensitive or just large temporary file.

theduke commented 1 month ago

I have been trying out jj over the last few days, and many of the ideas are awesome. I could probably put up with the other rough edges and switch to jj full-time, but the auto-tracking of new files will prevent me from doing so.

I second the comment from @marc-h38.

I have random temporary files, log files and other junk in my repos all the time, and while auto-tracking of changes in existing files is great and what I want, automatic tracking of new files is very much not, and has already annoyed me a lot over just three days.

Adding new files manually is not a chore for me, but an intentional choice and safeguard against adding files that should not be in the history

The discussion here has been idle for quite a while, with no conclusion from project owners like @martinvonz.

Has the thinking evolved recently? Are there any plans to implement a manual mode for new files?

durin42 commented 1 month ago

I've talked to Martin some out-of-band on and off about this. I've actually mostly come to appreciate auto-add of files, but still find it vexing that there's no mechanism for user-local ignores (or a sticky "forget this file" mechanism.) I keep trying to figure out the "right" UX for that, but it's a tough nut to crack.

andyg0808 commented 1 month ago

I happen to like the auto-add behavior, but @durin42's comment above just jogged something in my memory: If you have a colocated Git repo, you can add * to .git/info/exclude (there is a user-local ignore mechanism). Then no files will be auto-added. You can still use git add -f filename followed by a git commit -m 'temporary' to create a commit containing the file. Once it's tracked, JJ will continue to pick up changes correctly.

Interestingly, it seems that jj edit on that temporary commit actually works, and JJ doesn't lose track of the file at that point.
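For anyone who wants to script the first step of this workaround, here is a small Python sketch. Only the `.git/info/exclude` path is real Git behavior; the helper name `add_local_exclude` is hypothetical, and the sketch assumes a colocated repo where `.git` is a directory.

```python
from pathlib import Path


def add_local_exclude(repo_root, pattern="*"):
    """Append `pattern` to .git/info/exclude unless it is already listed.

    Returns True if the file was changed, False if the pattern was present.
    The exclude file is local to this clone and is never committed.
    """
    git_dir = Path(repo_root) / ".git"
    if not git_dir.is_dir():
        raise FileNotFoundError(f"{git_dir} is not a Git directory")
    exclude = git_dir / "info" / "exclude"
    exclude.parent.mkdir(exist_ok=True)
    text = exclude.read_text() if exclude.exists() else ""
    if pattern in text.splitlines():
        return False  # already excluded; leave the file untouched
    if text and not text.endswith("\n"):
        text += "\n"  # keep one pattern per line
    exclude.write_text(text + pattern + "\n")
    return True
```

With `*` excluded this way, individual files can still be brought in with git add -f as described above.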

I wonder if a good design around this whole problem would be to add a jj track which would be the inverse of jj untrack and which would force JJ to start tracking an otherwise-ignored file.
