jy-gh / RecentFiles

An Alfred workflow--and a command line utility--to easily find recently modified files.
MIT License

Permanent dross at top of results list #7

Closed NicholasSloan closed 1 year ago

NicholasSloan commented 1 year ago
Screenshot of Nisus Writer Pro (18-08-2023, 10-22-20)

Why do these four items appear permanently at the top of my RF results list? The fifth item is a genuine recent. When I invoke Alfred and type rf+space, I see a list with these four items and a lot more; after three seconds (is that normal?) all but the top four are replaced by the recents list.

My scope is set to a full work SSD (not the boot disk) and to "Files Only".

jy-gh commented 1 year ago

Nicholas,

My first thought is that Alfred is trying to be helpful, somehow, perhaps because of something set in the Preferences or maybe even another workflow.

The thing that makes me most curious is the diamond icon and the timestamp in the second line of the results. For example, Skim has the icon, then 07-28 10:30, a space, and then a heart icon. (It's almost as if those got marked as favorites or bookmarks, but I didn't know Alfred could do that.)

That's not something that the Recent Files workflow does--by default it puts the absolute path of found files in the subtitle of the result (so for the result "Clock regulation 2023.numbers" in your example, the title is "Clock regulation 2023.numbers" and the subtitle is "/Volumes/Work/Parish/Clock/Clock regulation 2023.numbers"). I've never seen that kind of icon/timestamp in Alfred results before, and I don't know what it signifies.

There are a number of things we can try to figure this out. We can start by temporarily disabling the Recent Files workflow like so:

  1. Invoke Alfred and use the "pref" keyword to open Alfred's preferences.
  2. Click on Workflows in the left sidebar.
  3. Right click on the Recent Files workflow and uncheck Enabled.

Now, if you invoke Alfred and type rf+space, like you have been, I would expect those four default results (Skim, etc.) to continue to display. To me that likely means that the Recent Files workflow isn't the cause of those items displaying.

The problem with this experiment, of course, is that even if Recent Files isn't the cause of this issue, we don't know what is, and that still leaves you with extraneous results.

Another thing to try would be to carefully examine your Alfred Preferences to see if something has been enabled that would cause this. I have looked through the preferences three or four times this morning trying to see if something jumped out at me, to no avail.

The following items looked like they had potential, and if you're up for it you could try temporarily unchecking those options if they're checked:

Features -> Universal Actions --> Action Ordering checkbox
Advanced -> Learning --> Top Result Keyword Latching checkbox

The last idea I have would be to disable all of your workflows, except Recent Files, and see if the behavior persists. If it stops, I would enable one workflow at a time, invoke Recent Files, and see if the behavior returns. If it does, that last workflow is doing it somehow. I'm doubtful if this is the issue, as I don't know how that could possibly persist, but it's easy to do and worth a shot.

I'll ask on the Alfred Forum if anyone knows what puts those icons and the timestamp on the results and if I find anything helpful I'll let you know.

jy-gh

jy-gh commented 1 year ago

Nicholas,

I should also have suggested changing the keyword for Recent Files from 'rf' to something else to see if you get the exact same behavior or not.

You can do that using the Configure Workflow button for the workflow. Let me know if you need any help with that.

-jy-gh

NicholasSloan commented 1 year ago

Hi jy-gh, I did try to reply to the notification email but I think the reply address may not have worked. In short you pointed me in the right direction: I was trying to run RF and Recent Documents workflows (https://github.com/mpco/AlfredWorkflow-Recent-Documents) at the same time. It was the scripts in RD that were somehow adding the weird subtexts.

All good now —except that I am having a hard time making RF work with colon separated folder/volume names in the top level directory. Do spaces need to be replaced with %20? Even so, whenever I have more than one (colon-separated) path RF stops working. Many thanks for your help, Nick

jy-gh commented 1 year ago

Nick,

Hm. First, what version of RF are you using? The current one is v0.8.6, and it should work with spaces in directory names.

I did the following (from my $HOME directory) using a shell/terminal window:

mkdir 'Dir with spaces'
touch 'Dir with spaces'/foobar.txt
mkdir 'Another dir with spaces'
touch 'Another dir with spaces'/barfoo.txt
touch ~/Desktop/foobarbar.txt

Then, I put this in as my top level directory using the Workflow Configuration:

~/Dir with spaces:~/Desktop:~/Another dir with spaces

I was getting the expected files displayed.
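
A minimal sketch of why spaces need no %20 escaping in a colon-separated list: if the string is split only on ":", each piece keeps its spaces intact. This is an illustration in plain POSIX shell, not the actual Recent Files code; the function name split_dirs is hypothetical.

```shell
#!/bin/sh
# Hypothetical illustration only: NOT the workflow's real implementation.
# Splits a colon-separated list of directories and expands a leading "~".
split_dirs() {
  (
    IFS=':'   # split on colons only, so spaces inside names survive
    set -f    # disable filename globbing while the string is split
    for p in $1; do
      case "$p" in
        "~")   p=$HOME ;;
        "~/"*) p="$HOME/${p#??}" ;;   # drop the leading "~/" (two characters)
      esac
      printf '%s\n' "$p"
    done
  )
}

# Prints one directory per line, with spaces preserved.
split_dirs '~/Dir with spaces:~/Desktop:~/Another dir with spaces'
```

Each output line is one directory, so a name such as "Dir with spaces" stays a single entry.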

If you're not, and you're using v0.8.6, can you send me the output that's displayed when you run RF using the debugger?

If you haven't done that before, just fire up the Alfred preferences, click on the Recent Files workflow, then click on the little bug icon on the bottom row. That will open a debug pane where Recent Files will write output, and that output should display an error message that may be quite helpful.

Let me know if I haven't been very clear on how to do that and I'll write up something with some screenshots to guide you through it.

Thanks! -jy-gh

NicholasSloan commented 1 year ago

Sorry again jy-gh, it seems to work ok with multiple directories now. No idea what changed. But as is the way of things, this leads to another problem. Because of the earlier issue I duplicated the workflow and used two versions with different keywords. One searches my work disk, is reasonably quick and works as expected.

The other searches, among other things, my main Applications folder. It takes an age to process (over 20 secs) and picks up a lot of dross because it seems to look into package contents and find a whole range of obscure files I am not interested in. I started trying to add these filetypes to the exclude list but ran into difficulties (should they be comma-separated or what?). A better approach would be an include list. Is this doable? That configuration window with light grey text in tiny fields is not very user-friendly.

One other thing, any reason why the command-modified action is browse in Alfred? I have added a Reveal action, which makes it simpler to grab unwanted filetypes, but it's a laborious process.

You may think I am asking too much of this workflow, but in my setup there is a lot going on in subfolders in my Apps folder that I need to keep track of, but it's mostly only text, rtf files and pdfs. An include-file approach would make things a lot easier.

Many thanks, Nick

Screenshot of Alfred Preferences (20-08-2023, 09-02-48)

jy-gh commented 1 year ago

Nick,

Well, I'm glad some progress has been made. Let me try and address each issue that you bring up:

  1. You've mentioned that you're searching your "main Applications folder". Do you mean that you're searching the /Applications folder?

Are you saving data files to this directory? That is, if you create a document using Word or Pages, or a spreadsheet using Excel or Numbers, are those files being saved to the system /Applications folder?

If that's what's going on--unless I've totally misunderstood things--I would strongly advise against that.

Mixing applications and user data files can cause all sorts of problems, such as the following:

a. Upgrading or reinstalling an application gets much more complicated, and can jeopardize your data. As an example, if you updated Word, it's completely possible that the upgrade process could delete any number of user-created documents in the /Applications/Microsoft Word.app folder structure.
b. Backups--and restoring from backups--become much more difficult.
c. This potentially creates a security vulnerability on your machine, as files with executable content (such as Microsoft Word macros) could be owned by a privileged user.

Ideally, you'd want to separate things you can reinstall (such as Word, Pages, etc.) and data to restore (such as documents/spreadsheets/presentations/etc. that you've created).

Really, the workflow is intended to ignore entire directory structures such as /Applications, and you're experiencing the reason for it.

Imagine that an include functionality were to be added to the workflow, and that I use it to search for .pdf and .txt files, only, in the /Applications directory. On my system, the /Applications folder contains 1,156 .pdf files and 1,152 .txt files from various programs that I have installed. If I also start storing .pdf and .txt files that I've created in that directory structure, every single search I make has to also search those 2,308 files that I am likely not interested in. This isn't counting the number of folders the program has to traverse in order to find files in it: my /Applications folder has 182,046 subfolders.

This is a staggering amount of extra work.

The way Recent Files is intended to work is to bypass folders such as /Applications entirely, which will save a large amount of time as it won't have to wade through "packages of obscure files" that you're not interested in.
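
Counts like the ones quoted above can be gathered with the standard find command. The sketch below is only one way to do it; the helper names count_dirs and count_ext are hypothetical, and the numbers will differ on every machine.

```shell
#!/bin/sh
# Hypothetical helpers for sizing up a folder before searching it.

# Number of directories under (and including) the given folder.
count_dirs() {
  find "$1" -type d | wc -l | tr -d ' '
}

# Number of regular files with the given extension under the folder.
# (Names containing newlines would be over-counted; fine for a rough estimate.)
count_ext() {
  find "$1" -type f -name "*.$2" | wc -l | tr -d ' '
}

if [ -d /Applications ]; then
  echo "folders: $(count_dirs /Applications)"
  echo "pdf:     $(count_ext /Applications pdf)"
  echo "txt:     $(count_ext /Applications txt)"
fi
```

A large folder count is a hint that the directory is a good candidate for an Ignore file entry.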

  2. An ignore file is not a comma-separated list. It's a text file having one line per directory or file type to ignore. (It's not something I invented, it's used natively by the fd command that powers Recent Files, which is why it's being used here.) It's documented in https://github.com/jy-gh, in the Ignore Files section, but basically it's formatted like so:
Directory_I_want_to_ignore
*.file_extension_I_want_to_ignore

As an example, if you are intent on searching your entire hard drive, you could create the following Ignore file:

/Applications
/Library
/System
Icon?
~*
*.photoslibrary
*.musiclibrary
*.tvlibrary

The file has to be a plain text file, not a .rtf file or anything like that (use TextEdit or nano or something similar to create it), and each line of the file should have a separate entry for whatever item(s) you'd like to ignore. I can help you create an Ignore file if you need help.
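
As an alternative to TextEdit or nano, a plain text Ignore file can also be written straight from a terminal. This is a sketch under assumptions: the file name and location are made up for illustration, and the entries are just a subset of the example above.

```shell
#!/bin/sh
# Hypothetical path; put the file wherever you like and point the
# workflow's ignore file setting at it.
ignore_file="${TMPDIR:-/tmp}/recent_files_ignore_example.txt"

# The quoted 'EOF' keeps the shell from touching characters such as "*" or "$".
cat > "$ignore_file" <<'EOF'
/Library
/System
Icon?
~*
*.photoslibrary
EOF

# One entry per line, nothing else in the file.
cat "$ignore_file"

# The same file can be tried directly with fd, using the flags shown
# elsewhere in this thread, for example:
#   fd --search-path "$HOME" --ignore-file "$ignore_file"
```

Because the file is written by the shell, it is guaranteed to be plain text rather than .rtf.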

  3. An include list is a feature that I could consider adding in the future. I'd have to do some thinking about how best to do it, but it should be possible. (I'm wondering if that causes difficulty with including both file and directory results in searches, but there's probably a way to handle that.)

Keep in mind, though, that even if that's added, if files are stored in a folder like /Applications, Recent Files will still be slower than it should be since it has to wade through tens of thousands of folders and check hundreds of thousands of files (I have 708,735 files in /Applications) in order to find the right files to search.

  4. The command-modified action of browse in Alfred is an oversight on my part. That's something I could address in an update.

  5. I agree that the configuration window text is smaller and harder to read than is desirable. That's an Alfred thing, not controlled by the workflow, and if there's a way to change it I don't know how.

-jy-gh

NicholasSloan commented 1 year ago

Hi jy-gh, Yes, /Applications. I have many third party apps stored in categorised subfolders of my Apps folder. I file Read Me files, manuals, and my own (mostly rtf) practice notes in the app subfolders. Only app-related data is stored here, not critical work.

I have never had a problem with upgrades (or daily backups). I don’t mess with app-created folders. Mostly I keep the apps themselves at the root level and alias to them in a subfolder. (And I don’t use Word or Pages.) So, I am prepared to believe that this may be bad practice, or at least unorthodox, but it does not seem to have caused me any problems in 25 years…except now ;-)

==

  1. Ok, I understand now why this setup makes life difficult for Recent Files/fd.
I had not fully appreciated that it would take so much computing effort to identify a limited number of file types and dates even within such a complex folder structure.
I’m still not clear however why you have to look into packages: that must increase the work massively. There are scenarios in which you might want to do this, but it seems to make sense not to by default.

  2. Thanks for explaining about the ignore file. I expect I’ll be able to suss that out.
Can I add packages to the list? Would fd still need to traverse the contents?

  3. I guess you can’t have both an exclude and an include list, or at least they have to be alternatives?

  4. Command reveal is easily added. I have already done it for my copy. But thanks.

  5. I did wonder if you were constrained by the Alfred GUI configuration. Understood.

Sorry to have proved such a troublesome user. At least I have a better understanding of the issues now. Recent Files is working great for my Work disk. I’ll probably stop trying to use it for Apps. Many thanks, Nick

jy-gh commented 1 year ago

Nick,

> Yes, /Applications. I have many third party apps stored in categorised subfolders of my Apps folder. I file Read Me files, manuals, and my own (mostly rtf) practice notes in the app subfolders. Only app-related data is stored here, not critical work.

Ah, after I read your response the use case makes a lot more sense now. I had visions of vacation pictures, Christmas letters, and financial statements all being saved to the /Applications folder! My imagination clearly ran wild there.

> Ok, I understand now why this setup makes life difficult for Recent Files/fd.
> I had not fully appreciated that it would take so much computing effort to identify a limited number of file types and dates even within such a complex folder structure.
> I’m still not clear however why you have to look into packages: that must increase the work massively. There are scenarios in which you might want to do this, but it seems to make sense not to by default.

You're right that it would add a lot of computing effort. I actually hadn't anticipated someone doing this and so hadn't contemplated handling it. I think one could ignore virtually any file if you can specify it in the Ignore File, but more on that later.

> Thanks for explaining about the ignore file. I expect I’ll be able to suss that out.
> Can I add packages to the list? Would fd still need to traverse the contents?

Yes, I would imagine that you could, as long as they have some predictable name or extension. If they're in the Ignore File, fd will skip them (it's really a fantastic utility).

> I guess you can’t have both an exclude and an include list, or at least they have to be alternatives?

Here's where I have good news for you, I hope. I have just added v0.9.0, with support for a configuration item for extensions. You may put in a colon-separated string to specify file types that you want to target for results. Something like rtf:md:txt might work for you.

> Command reveal is easily added. I have already done it for my copy. But thanks.

I thought it was such a good idea that I added it as well. Command will reveal in Finder, Control will bring up the Open with... Alfred action, and Option will open a Terminal window in the directory (or the parent directory if the target is a file).

> I did wonder if you were constrained by the Alfred GUI configuration. Understood.

Yes, I have felt that it was a bit small to read as well. It would be nice, too, to have some additional tools for configuration. (As an example, it would be nice to have user-configurable Action associations, so that anyone could adjust the Command/Control/Option/etc. to action mappings according to their preferences.)

> Sorry to have proved such a troublesome user. At least I have a better understanding of the issues now. Recent Files is working great for my Work disk. I’ll probably stop trying to use it for Apps.

You haven't been troublesome at all. You've given me a couple of great ideas, and those ideas made it into v0.9.0. (You've even got a shout-out in my README.md.)

My belief is that a combination of the Ignore File functionality and the new ability to specify file/directory extensions might allow you to use Recent Files on your Applications folder. I hope you give it a try--and if you do, let me know how it performs.

Thanks!

-jy-gh

NicholasSloan commented 1 year ago

Thanks -jy-gh, great advances! Having listed a limited number of extensions to include and added an ignore list, I am now getting much snappier and cleaner results from my (very extensive) Apps folder

Just one question about the ignore file: in the example, what does the line <Icon?> signify? And does <~*> signify “everything in the Home folder”? Either way, my ignore file seems to be doing the trick.

Presumably the leading colon (default text in Filename Extensions box) is unnecessary?

And now a cheeky and off-topic comment: I think your icon is dreadful! I know, it’s a bit like coming into your home and criticising the furniture, I apologise but I am fussy about these things. Attached is a screenshot of the two iterations of Recent Files that I am running. I pinched the icon from another app, which is ok for me to do privately but not for you to do publicly—but it gives you the idea. If you like I could come up with something useable. All the best, Nick

Screenshot of Alfred Preferences (21-08-2023, 08-45-17)

jy-gh commented 1 year ago

Nick,

> Thanks -jy-gh, great advances!

I'm very glad to hear that, thank you!

> Having listed a limited number of extensions to include and added an ignore list, I am now getting much snappier and cleaner results from my (very extensive) Apps folder
>
> Just one question about the ignore file: in the example, what does the line <Icon?> signify? And does <~*> signify “everything in the Home folder”? Either way, my ignore file seems to be doing the trick.

Ah, both of those are important. The characters "?" and "*" here come from globbing, a Unix concept for matching files using wildcard characters. (And macOS is definitely a Unix derivative.) A glob like "~*" matches all files beginning with the tilde character--and the use of a tilde to represent a user's home directory is coincidental and unrelated. (I am reasonably sure globbing pre-dates the common adoption of "~" as a way to reference home directories, but at any rate they're slightly different concepts used in slightly different contexts.)

Many programs--including Emacs, the legendary and highly influential text editor--use files beginning with the "~" character for temporary or backup files. That's why they're excluded from results.

In a glob, "?" matches a single character, so a glob like "?.???" matches all files with a one-character name followed by a three-character extension, such as a.txt.

The Icon? pattern matches macOS's custom icon files. If you see one in the Finder, it displays as "Icon?", but in reality the "?" character is an invisible character, a carriage return character. (And there's a confusing subtlety here where the glob uses "?" to match a single character, and the Finder displays a "?", but this is likely coincidental; the Finder is just using the "?" to represent an invisible character, not referencing the globbing scheme used in Ignore files.) They're annoying to have in results, so that's why I filter them out. You can read more about them at https://superuser.com/questions/298785/icon-file-on-os-x-desktop
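
The wildcard behavior described above can be tried out in a terminal. The sketch below uses the shell's own pattern matching, which treats "?" and "*" the same way for these simple cases; it is an approximation of, not a substitute for, the matching that fd performs on Ignore file entries. The function name matches is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: prints "yes" if NAME matches the glob PATTERN.
# usage: matches PATTERN NAME
matches() {
  case "$2" in
    $1) echo yes ;;   # unquoted on purpose so it is treated as a pattern
    *)  echo no ;;
  esac
}

matches '?.???' 'a.txt'        # yes: one character, a dot, three characters
matches '?.???' 'ab.txt'       # no:  the name part has two characters
matches '~*'    '~backup.doc'  # yes: begins with a tilde
matches '~*'    'notes.txt'    # no:  does not begin with a tilde

# The custom icon file name really ends in a carriage return,
# which "?" matches like any other single character.
matches 'Icon?' "Icon$(printf '\r')"   # yes
```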

> Presumably the leading colon (default text in Filename Extensions box) is unnecessary?

No, that's a necessary hack, and it requires a bit of explanation. The Recent Files workflow depends on the recent_files command line program I wrote (which in turn depends on the fd command). The recent_files command line program now accepts an option, -e or --extension, which allows one to specify one or more file/directory extensions to display (.pdf, .html, etc.).

So, at a terminal, this would be a valid command line invocation of the recent_files program: recent_files -e pdf --extension txt. The options are the -e and --extension parts, and both of those options have mandatory arguments themselves, which would be the "pdf" and "txt" parts of the command above.

The Recent Files workflow needs to pass the -e option along to the recent_files program, and that's great when the user has actually specified an extension. However, if the user doesn't, which would be the default usage, there's no way in Alfred's workflow configuration to conditionally not pass the -e option along to recent_files; that is, there's no way (that I know of, anyways) to omit the -e for the default case.

This is a problem, since the recent_files program expects the -e/--extension option to have an argument, and correctly (in all option processing libraries for all programming languages I'm familiar with) considers this an error if the option doesn't have an argument. So, I needed to pass a harmless argument along in all cases, and ":" is harmless and will be ignored. Sorry for the long-winded answer here.
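
To make the idea concrete, here is a sketch of how a colon-separated extension string could be turned into -e arguments while empty pieces are skipped, which is why a lone ":" (or a leading colon) would be harmless. This is an assumption-laden illustration, not the actual recent_files code; the function name build_ext_args is hypothetical.

```shell
#!/bin/sh
# Hypothetical illustration only: NOT how recent_files is actually written.
# Emits "-e" and an extension, one per line, for each non-empty piece.
build_ext_args() {
  (
    IFS=':'   # split on colons
    set -f    # no filename globbing while splitting
    for ext in $1; do
      if [ -n "$ext" ]; then
        printf '%s\n' "-e" "$ext"
      fi
    done
  )
}

build_ext_args ':'          # prints nothing: the empty piece is skipped
build_ext_args ':rtf:md'    # prints -e, rtf, -e, md on separate lines
```

With this approach the default value ":" yields no extension filters at all, while ":rtf:md" and "rtf:md" yield the same two filters.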

> And now a cheeky and off-topic comment: I think your icon is dreadful! I know, it’s a bit like coming into your home and criticising the furniture, I apologise but I am fussy about these things. Attached is a screenshot of the two iterations of Recent Files that I am running. I pinched the icon from another app, which is ok for me to do privately but not for you to do publicly--but it gives you the idea. If you like I could come up with something useable.

Cheeky indeed! But sadly true. I am not a designer or an artist. I draw deformed stick figures. I would love a custom icon made by someone who has some ability!

Thank you!

-jy-gh

NicholasSloan commented 1 year ago

Sorry, everything is working fine but there's something I still don't understand:

In the first section of the example ignore list there is a series of directories, e.g. /Applications
The significance of these is described as Directory_I_want_to_ignore
My understanding would be, if I want RF to ignore a directory I add that directory to the head of the ignore list, yes?

But. My Primary RF workflow needs to search /Applications (with some file exclusions and inclusions configured).
If I configure the top level directory as /Applications, but I don't add /Applications to the ignore list it does not run;
If I add /Applications both as top level directory and in the ignore list…it runs fine and shows files in Applications.
What am I missing?

I'll see if I can come up with an icon. Are you happy with the general clock concept or would you prefer something else? All the best, Nick

Primary_ignore_file.txt

jy-gh commented 1 year ago

Nick,

> Sorry, everything is working fine but there's something I still don't understand:
>
> In the first section of the example ignore list there is a series of directories, e.g. /Applications
>
> The significance of these is described as Directory_I_want_to_ignore
>
> My understanding would be, if I want RF to ignore a directory I add that directory to the head of the ignore list, yes?
>
> But.
>
> My Primary RF workflow needs to search /Applications (with some file exclusions and inclusions configured).
>
> If I configure the top level directory as /Applications, but I don't add /Applications to the ignore list it does not run;
>
> If I add /Applications both as top level directory and in the ignore list…it runs fine and shows files in Applications.
>
> What am I missing?

Oh. You aren't missing anything. I unthinkingly kept using the example as an example instead of considering your use case here.

You absolutely wouldn't have /Applications in your ignore file. You might get a lot of value, however, from putting various subdirectories of /Applications into your ignore file, such as, for example, /Applications/Firefox.app or /Applications/Google\ Chrome (and note that you'd either enclose paths that include spaces inside single or double quotes or you'd use a backslash character "\" to escape the space as I have done with the space between "Google" and "Chrome").

> I'll see if I can come up with an icon. Are you happy with the general clock concept or would you prefer something else?

I am sure I would be thrilled by anything you came up with!

Thanks!

-jy-gh

NicholasSloan commented 1 year ago

But the point is that whatever the logic, I did have /Applications in my ignore file (see previous attachment) and as top level directory, and RF still found recent files from /Applications. If I took it out of the ignore file RF refused to run. I put that in the past tense because for whatever reason I have removed it from the ignore file again and now it does still run. I probably left a typo in the previous ignore file.

This is only an academic question, but it's an anomaly that confounds my understanding of how the configuration is supposed to work. I can only conclude that maybe the setting in Top level directory trumps the ignore file? No need to follow this up, everything is fine. Cheers, Nick

jy-gh commented 1 year ago

Nick,

I misread your previous comment, sorry.

> But the point is that whatever the logic, I did have /Applications in my ignore file (see previous attachment) and as top level directory, and RF still found recent files from /Applications. If I took it out of the ignore file RF refused to run. I put that in the past tense because for whatever reason I have removed it from the ignore file again and now it does still run. I probably left a typo in the previous ignore file.

Hmm. That's interesting, and I don't have a ready explanation. If this were to happen again, run Recent Files with the debugger active, capture the error--there should be one--and send it to me.

> This is only an academic question, but it's an anomaly that confounds my understanding of how the configuration is supposed to work. I can only conclude that maybe the setting in Top level directory trumps the ignore file? No need to follow this up, everything is fine.

I think you are right about the top-level directory having priority. I did a few experiments with an ignore file using fd (not recent_files).

With an ignore file like so:

/Applications

I was still able to display files in the /Applications directory with the command

fd --search-path '/Applications' --ignore-file example_ignore_file.txt

which illustrates your point.

If I changed the ignore file to this:

/Applications/*

I received no results. My guess is that this ignore file rule applies where the other one didn't because the '*' is dynamically expanded and so forces a comparison every time that fd tries to traverse a directory, but this is just speculation on my part.

While I was looking into this, I also did a little reading, and saw something that might be useful for you.

A common pattern with these ignore files is to do something like this, and I'll explain the strategy:

/Applications/*
!/Applications/Safari.app

The first line, of course, blocks everything in the /Applications directory, as we've seen. The second line, however, uses '!' as negation operator, which allows for results from the /Applications/Safari.app (only) folder to display. This is a technique that might be really useful for you. Testing it using the above fd command does indeed show results.

Now, combining that with a similar negation rule, one such as !*.txt, didn't work for me, so this approach may not ultimately help you.

Here's a resource that documents how the pattern matching works in Git, which is where the ignore file format and rules originate:

https://git-scm.com/docs/gitignore#_pattern_format

-jy-gh

NicholasSloan commented 1 year ago

All useful information, thank you.

In case you wondered why I should be so silly as to include /Library in the ignore file in the first place, it was because (perhaps because of the coincidence between the example and my use case) I mistakenly took the sequence of the ignore file to mean "These are the directories in which the following files should be ignored." Of course the real meaning is much more direct and useful. Just the anomaly in practice threw me a little. Enough on that…

I'm thinking about your icon and will get back to you when I have something. Nick

NicholasSloan commented 1 year ago

I have been experimenting with some icon ideas for your workflow. The concept that works best for me is a collection of rectangles, representing files, spiralling back into the distance. It cannot be very detailed if it is to show up clearly in the Alfred prefs sidebar. Also, black outlines, which I used originally, do not show up well there.

As far as I know, the only two places these icons are really seen are Alfred prefs (against dark purple) and the Alfred Gallery (against white). It's difficult to make them look good in both places. I started with a png with transparent background, which looked fine in the prefs sidebar but would look rubbish in the gallery. I ended up trying to match the Alfred purple for the background, but it would be better to have a version for each if that is possible.

Let me know if you want me to take this any further. Changes, different file formats etc. All the best, Nick

RF icon-4a

jy-gh commented 1 year ago

Nick,

I like the idea and the image. Do I just convert it to a .png and use it? (That seems to work, but it may not be the right way.)

Thank you for this!

-jy-gh

jy-gh commented 1 year ago

Closing this issue as it has resulted in enhancements to the Recent Files workflow.