novoid / filetags

Management of simple tags within file names
GNU General Public License v3.0

Tagtree Attribute Error: Property '<unknown>.Targetpath' can not be set in Windows 11 #70

Closed chaicurioquest closed 1 month ago

chaicurioquest commented 3 months ago

@novoid Hi, I am getting the following error, "AttributeError: Property '<unknown>.Targetpath' can not be set", while running filetags with tagtrees in recursive mode.

filetags --verbose --tagtrees --recursive --tagtrees-handle-no-tag no-tags --tagtrees-dir=E:\archive\.filetags_tagtree

The screenshot is below:

[Screenshot 2024-07-01 000958]

Thanks.

chaicurioquest commented 1 month ago

@novoid Hi, even when I run "D:\Python\Python312\Scripts\filetags.exe" --tagtrees --tagtrees-depth 3, I get the following error: [image] Any help? Thanks.

chaicurioquest commented 1 month ago

I have narrowed down the issue: if a file name contains special character(s), as below, it throws an error.

Is there any way to handle the exception when a file name contains special characters?
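
One way to keep a single problematic file name from aborting the whole run is to catch the exception per file and report the names that failed. The following is only a minimal sketch and not filetags code: `link_all` and `link_fn` are hypothetical names, and `link_fn` stands in for whatever function actually creates the link or shortcut.

```python
import logging
from typing import Callable, Iterable, List, Tuple


def link_all(
    pairs: Iterable[Tuple[str, str]],
    link_fn: Callable[[str, str], None],
) -> List[Tuple[str, str, str]]:
    """Call link_fn(source, destination) for each pair and collect failures.

    Returns a list of (source, destination, error message) for every pair
    where link_fn raised, instead of letting the first error end the run.
    """
    failures = []
    for source, destination in pairs:
        try:
            link_fn(source, destination)
        except (AttributeError, OSError, UnicodeError) as error:
            # AttributeError is the exception type reported in this issue;
            # the other two are assumptions about what else might go wrong.
            logging.warning("could not link %r: %s", source, error)
            failures.append((source, destination, str(error)))
    return failures
```

The returned list can be printed at the end of the run, so that only the affected files need to be renamed.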

novoid commented 1 month ago

Well, I created your file names on my Linux system and can not reproduce the issue. Unfortunately, I don't have access to Windows.

novoid commented 1 month ago

Here is my output in case somebody with access to Windows would like to volunteer:

vk@sting ~2d/2024-09-15 filetags issue 70/files % filetags --tagtrees --verbose --tagtrees-dir=../tagtrees
DEBUG    2024-09-15 16:57:32,974 extracting list of files ...
DEBUG    2024-09-15 16:57:32,974 len(options.files) [0]
DEBUG    2024-09-15 16:57:32,974 0 filenames found: []
DEBUG    2024-09-15 16:57:32,974 reported console width: 231 and height: 58   (80/80 is the fall-back)
DEBUG    2024-09-15 16:57:32,974 locate_and_parse_controlled_vocabulary: called with startfile: "False"
DEBUG    2024-09-15 16:57:32,974 locate_and_parse_controlled_vocabulary: called in cwd: /home/vk/tmp/2del/2024-09-15 filetags issue 70/files
DEBUG    2024-09-15 16:57:32,974 locate_file_in_cwd_and_parent_directories: called with startfile "/home/vk/tmp/2del/2024-09-15 filetags issue 70/files" and filename ".filetags" ..                                          
DEBUG    2024-09-15 16:57:32,974 locate_file_in_cwd_and_parent_directories: startfile [/home/vk/tmp/2del/2024-09-15 filetags issue 70/files] is a directory, using it as starting_dir [/home/vk/tmp/2del/2024-09-15 filetags issue 70/files] .....
DEBUG    2024-09-15 16:57:32,974 locate_file_in_cwd_and_parent_directories: looking for ".filetags" in directory "/home/vk/tmp/2del/2024-09-15 filetags issue 70" .......
DEBUG    2024-09-15 16:57:32,974 locate_file_in_cwd_and_parent_directories: found ".filetags" in directory "/home/vk" ........
DEBUG    2024-09-15 16:57:32,974 locate_and_parse_controlled_vocabulary: locate_file_in_cwd_and_parent_directories returned: /home/vk/.filetags
DEBUG    2024-09-15 16:57:32,974 locate_and_parse_controlled_vocabulary: .filetags found: /home/vk/.filetags
DEBUG    2024-09-15 16:57:32,974 locate_and_parse_controlled_vocabulary: found controlled vocabulary

[...]

DEBUG    2024-09-15 16:57:32,974 handling option for tagtrees
DEBUG    2024-09-15 16:57:32,974 User overrides the default tagtrees directory to: ../tagtrees
DEBUG    2024-09-15 16:57:32,974 found old tagfilter directory "../tagtrees"; deleting directory ...
DEBUG    2024-09-15 16:57:32,975 re-creating tagfilter directory "../tagtrees" ...
DEBUG    2024-09-15 16:57:32,975 get_files_of_directory(/home/vk/tmp/2del/2024-09-15 filetags issue 70/files) called and traversing file system ...
DEBUG    2024-09-15 16:57:32,975 get_files_of_directory(/home/vk/tmp/2del/2024-09-15 filetags issue 70/files) finished with 2 items
DEBUG    2024-09-15 16:57:32,975 locate_file_in_cwd_and_parent_directories: called with startfile "/home/vk/tmp/2del/2024-09-15 filetags issue 70/files" and filename ".filetags" ..
DEBUG    2024-09-15 16:57:32,975 locate_file_in_cwd_and_parent_directories: startfile [/home/vk/tmp/2del/2024-09-15 filetags issue 70/files] is a directory, using it as starting_dir [/home/vk/tmp/2del/2024-09-15 filetags issue 70/files] .....
DEBUG    2024-09-15 16:57:32,975 locate_file_in_cwd_and_parent_directories: looking for ".filetags" in directory "/home/vk/tmp/2del/2024-09-15 filetags issue 70" .......
DEBUG    2024-09-15 16:57:32,975 locate_file_in_cwd_and_parent_directories: found ".filetags" in directory "/home/vk" ........
DEBUG    2024-09-15 16:57:32,975 generate_tagtrees: I found controlled_vocabulary_filename "/home/vk/.filetags" which I'm going to link to the tagtrees folder
DEBUG    2024-09-15 16:57:32,975 create_link(/home/vk/.filetags, ../tagtrees/.filetags) called
INFO     2024-09-15 16:57:32,975 Creating tagtrees and their links. It may take a while …  (exponentially with respect to number of tags)
DEBUG    2024-09-15 16:57:32,975 get_tags_from_files_and_subfolders called with startdir [/home/vk/tmp/2del/2024-09-15 filetags issue 70/files], cached startdirs [0]
DEBUG    2024-09-15 16:57:32,975 get_tags_from_files_and_subfolders: Writing 1 tags in cache for directory: /home/vk/tmp/2del/2024-09-15 filetags issue 70/files
DEBUG    2024-09-15 16:57:32,975 generate_tagtrees: handling file "/home/vk/tmp/2del/2024-09-15 filetags issue 70/files/Nato Science Series II_ №88] André Anders (auth.), Efim Oks, Ian Brown (eds.) - Emerging Applications of Vacuum-Arc-Produced Plasma, Ion and Electron Beams (2002, Springer) -- foo.pdf" …
DEBUG    2024-09-15 16:57:32,975 create_link(/home/vk/tmp/2del/2024-09-15 filetags issue 70/files/Nato Science Series II_ №88] André Anders (auth.), Efim Oks, Ian Brown (eds.) - Emerging Applications of Vacuum-Arc-Produced Plasma, Ion and Electron Beams (2002, Springer) -- foo.pdf, ../tagtrees/foo/Nato Science Series II_ №88] André Anders (auth.), Efim Oks, Ian Brown (eds.) - Emerging Applications of Vacuum-Arc-Produced Plasma, Ion and Electron Beams (2002, Springer) -- foo.pdf) called
DEBUG    2024-09-15 16:57:32,975 generate_tagtrees: handling file "/home/vk/tmp/2del/2024-09-15 filetags issue 70/files/NTU GIEE|Graduate Institute of Electronics Engineering -- foo.pdf" …
DEBUG    2024-09-15 16:57:32,975 create_link(/home/vk/tmp/2del/2024-09-15 filetags issue 70/files/NTU GIEE|Graduate Institute of Electronics Engineering -- foo.pdf, ../tagtrees/foo/NTU GIEE|Graduate Institute of Electronics Engineering -- foo.pdf) called
INFO     2024-09-15 16:57:32,975 Number of links created in "../tagtrees" for the 2 files: 2  (tagtrees depth is 2) 
DEBUG    2024-09-15 16:57:32,975 platform.system() is: [Linux]
DEBUG    2024-09-15 16:57:33,493 successfully finished.
vk@sting ~2d/2024-09-15 filetags issue 70/files %

nbehrnd commented 1 month ago

@chaicurioquest I presume the greater picture of tagging the pdf files here is to organize (academic) papers. If this is the case, I suggest investing some time in reference managers; larger research/university libraries typically host workshops that present their pros and cons and how to use them efficiently. There is plenty of material for self-study, too.

One example is zotero, which is open source, freely available and cross-platform. It lets you manage fetching (and completing) the pdf and their bibliographic data where possible, mark and annotate the documents, and organize the material in hierarchies, and it eventually helps you write a new report/paper with a word processor like Word or Writer, or with a text-based workflow built on a LaTeX engine, Markdown or pandoc. This can be very useful in many ways -- regardless of whether the collection(s) of pdf are your own library or a collection partially shared within a group. A recently seen example of how zotero integrates the concept of tags is Mastering Zotero: Working with Tags by Kris Joseph, York University, Canada, part of a series of installments.

I speculate that the suggestive bold around №88 points at the single character № itself. It isn't part of the original (7-bit) ASCII, but it is included in the later code page Windows-1251, an extension to support Cyrillic characters, while Windows-1252 (for Latin script) does not include it. Any objections to recording the №88 information as volume 88 of a journal or book series, which is then more easily understood and processed by a reference manager regardless of which code page is used? (It should be a lesser issue with UTF-8 character encoding.)
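
The code page point can be checked with Python's built-in codecs. This is a small illustration only; the helper name `encodable_in` is made up for this example and says nothing about which encoding filetags or Windows actually uses at the failing step.

```python
def encodable_in(text: str, encoding: str) -> bool:
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


numero = "\u2116"  # NUMERO SIGN, the "№" in "№88"
for encoding in ("ascii", "cp1252", "cp1251", "utf-8"):
    print(f"{encoding:7} {encodable_in(numero, encoding)}")
```

Run as written, this reports False for ascii and cp1252 and True for cp1251 and utf-8. Note that the same file name also contains "é", which exists in Windows-1252 but not in Windows-1251, so neither of the two single-byte code pages covers the whole name.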

novoid commented 1 month ago

"I presume the greater picture of tagging the pdf files here is to organize (academic) papers."

No.

filetags is a tool that offers simple file tagging. It is method-agnostic and doesn't imply any particular method. You can think of many multi-classification workflows that may also be implemented with filetags. Only if you need concepts not implemented in filetags, such as tag inheritance, categorizing vs. describing tags, ..., do you face limitations of filetags.

Back to the issue at hand: my guess would be that Windows introduces issues by using non-UTF-8 charsets, issues that GNU/Linux systems do not have. However, for further analysis, I'd need access to a Windows license and at least a VM or such. My limited time budget doesn't allow for the setup process associated with that (my guess would be at least an hour for a VM) with an unclear result.

You can - however - try to narrow down the problem by removing special characters from your file name step by step and noting here after which step the issue was gone. E.g., removing the №88 would be one of the first things I'd try.
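
To decide which characters to remove first, it can help to list every non-ASCII character in a file name together with its Unicode name. A minimal sketch, assuming that "special" here means "outside 7-bit ASCII"; the function name `non_ascii_characters` is made up for this example, and the file name is a shortened version of the one in the log above.

```python
import unicodedata
from typing import List, Tuple


def non_ascii_characters(filename: str) -> List[Tuple[int, str, str]]:
    """Return (position, character, Unicode name) for each non-ASCII character."""
    return [
        (index, char, unicodedata.name(char, "UNKNOWN"))
        for index, char in enumerate(filename)
        if ord(char) > 127
    ]


name = "Nato Science Series II_ \u211688] Andr\u00e9 Anders (auth.) -- foo.pdf"
for index, char, label in non_ascii_characters(name):
    print(index, char, label)
```

For this name, the candidates are the NUMERO SIGN and the LATIN SMALL LETTER E WITH ACUTE; removing or replacing them one at a time shows which one triggers the error.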

HTH

nbehrnd commented 1 month ago

@novoid One point is that filetags is meant to tag any file, agnostic of the file type to be processed. I agree on this one.

On the other hand, if I fetch a paper by link, or via the doi/a resolver, I let zotero digest the pdf. Here, zotero both collects the bibliographic data and automatically renames the file based on the pdf metadata. The intent is related to guess-filename.py, only different keywords are accessed here. 2017_Elgrishi_Rountree.pdf is an example of such a predictable pattern, which consists only of ASCII letters, digits 0 to 9, and underscores -- no spaces, no umlauts, no accents -- regardless of the source of the pdf. The resulting pattern is adjustable; it is equally possible, e.g., to pick the year of publication, only one of the authors' names, and a couple of words of the title. Zotero can process batches of pdf in this way (including papers already managed and stored under a different pattern of file names). The name of the file in zotero's storage needn't be the key by which it is eventually cited when writing a paper.

This new file name is simple enough not to cause a problem for tagging by filetags, which works independently of zotero's mechanism. And -- if wanted -- the pdf in question can later be reconnected to its particular record in the zotero database.
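
For files not managed by zotero, a similar ASCII-only name can be approximated with Python's standard library. This is a rough sketch and not zotero's renaming algorithm; the function name `ascii_filename` is made up, and the sketch assumes that the tags follow a " -- " separator as in the file names of the log above, so that part is left untouched.

```python
import re
import unicodedata


def ascii_filename(name: str) -> str:
    """Reduce the part before " -- " to ASCII letters, digits, '.', '-', '_'."""
    base, separator, tags = name.partition(" -- ")
    # NFKD splits "é" into "e" plus a combining accent and maps "№" to "No".
    decomposed = unicodedata.normalize("NFKD", base)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Collapse every run of other characters (spaces, brackets, ...) into "_".
    cleaned = re.sub(r"[^A-Za-z0-9.-]+", "_", ascii_only).strip("_")
    return cleaned + separator + tags


print(ascii_filename("II_ \u211688] Andr\u00e9 Anders -- foo.pdf"))
```

Characters without an ASCII decomposition are silently dropped, so the result should be checked before renaming a large batch of files.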

@chaicurioquest Instead of 10.1007/978-94-010-0277-6 as doi of the whole book, an individual chapter (or its pdf) with its doi (e.g., 10.1007/978-94-010-0277-6_3) may be more specific.

chaicurioquest commented 1 month ago

@nbehrnd Thank you for the valuable suggestion. I have rarely used Zotero before for article reference management. My file and folder structure was quite ordinary, like that of other Windows users (e.g. different drives for different categories of workflow, such as the C drive for Windows files, the D drive for work and the E drive for backup).

Big thanks to @novoid for turning his ideas into an effective PIM approach. Before adopting file tagging, I did some research on different file management approaches such as the PARA Method, Johnny.Decimal and a few other techniques. I badly needed to manage my digital data, since it had piled up into a mess over more than a decade. I have even bought Google Drive space to sync my data. I found file tagging intuitive and straightforward, something that has grown naturally over the years. Once I had got the idea of file tagging, I tried ISO-formatted file names (yyyy-mm-dd), tag filters and so on from @novoid's various blog posts. Recently, I needed to manage notes. I stumbled upon Emacs Org-mode, and I am figuring out how to adopt it for note-taking, linked notes (Org-Roam), visualization (Org-Roam-Graph) and advanced citation management (Org-Ref), similar to Obsidian. I still have more to do to streamline this. Coming to Zotero, I recently tried it for article references with tags (thanks again to Karl Voit for introducing tags), as shown in Fig 1. @nbehrnd As you mentioned, Zotero is quite useful for managing articles and could be part of the workflow.

[image: Fig 1]

I am trying to integrate Karl Voit's PIM method (e.g. maintaining ISO file naming, Fig 2 and 3) with other workflows:

1. Zotero for article references
2. Emacs Org linked with Zotero references, including annotations, notes and tags, through the Better BibTeX for Zotero plugin (for exporting library collections and bibliographies), Zotfile (for extracting annotations and notes) and org-zotxt (for linking and inserting Zotero citations in Org mode)
3. Emacs Org for task management, calendar and so on

[image: Fig 2]
[image: Fig 3]

I am a beginner at managing digital files and workflows with this approach. Please excuse any points that are inaccurate or unclear.

"This new file name is simple enough not to cause a problem for tagging by filetags, which works independently of zotero's mechanism. And -- if wanted -- the pdf in question can later be reconnected to its particular record in the zotero database."

I will try to add tag(s) to the article file name while renaming the file (although the Zotero GUI already retrieves articles by tag), to stay consistent with file tagging.

Back to the issue: I removed the special characters and it worked.

Thank you, @novoid and @nbehrnd.

novoid commented 1 month ago

Glad that you could make it run.

Sorry that special characters may introduce such a nasty issue in 2024. :pensive:

I honestly don't get the Zotero discussion here. (That's not intended as criticism.)

A common pattern in my workflows is that I either (A) manage meta-data of files in my Org-mode or (B) in the file system with optional meta-data within Org-mode. I rarely manage meta-data in both spots.

For (A), I mostly try to get any unique file name, not renaming the whole thing to my liking as long as I get an idea of what it contains. Using path-independent links https://karl-voit.at/2022/02/10/lfile/ I link them at places in my Org-mode where I usually start the retrieval process for those types of files.

Examples for (B) are my own photographs, cliparts, fun files, ...

HTH

I close this issue for now. I may return to it when I have a working Windows environment in the future to analyse the issue any further.