Closed oderyn closed 11 months ago
Here's the latest.
It looks like I can add multiple tags, but they aren't showing up -- OR I am not doing it properly.
Here's the code that seems to be working:
from sist2 import Sist2Index
import sys

index = Sist2Index(sys.argv[1])
for doc in index.document_iter():
    # Check if 'tag' exists in the document's json_data
    if 'tag' in doc.json_data:
        # If it does, extend the list of tags
        print("Extend!")
        print(doc.json_data["tag"])
        doc.json_data["tag"].extend(["onion.#ffffff"])
    else:
        # If it doesn't, create a new list with both tags
        doc.json_data["tag"] = ["hamburger.#00FF00", "pickles.#00FF00"]
        print("Add!")
    index.update_document(doc)

index.sync_tag_table()
index.commit()
print("Done!")
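One side note on the script above: the log output shows the list already containing 'onion.#ffffff' before extend() runs, so every re-run would append another copy. A minimal sketch of a duplicate-safe merge follows. It is plain Python with no sist2 calls; `merge_tags` and the 'relish' tag are hypothetical names used only for illustration, and this is not a fix for the display problem itself.

```python
def merge_tags(existing, new_tags):
    """Return the existing tags plus any new ones, skipping exact duplicates."""
    merged = list(existing or [])  # copy, and tolerate a missing/None tag list
    for tag in new_tags:
        if tag not in merged:
            merged.append(tag)
    return merged


# Stand-in for doc.json_data, using the tag values printed in the task log.
json_data = {"tag": ["hamburger.#ffFF00", "pickles.#00FFff", "onion.#ffffff"]}
json_data["tag"] = merge_tags(json_data.get("tag"), ["onion.#ffffff", "relish.#00FF00"])
print(json_data["tag"])
```

Inside the loop, the same helper could presumably replace both branches of the if/else, since it handles a missing tag list as well as an existing one.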
However, the tags do not display in the UI. Only in the task log:
[ADMIN ] Starting user script with executable='/sist2-admin/scripts/test/run.sh', index_path='/sist2-admin/scan-test-2023-09-04 14:48:08.730895.sist2', extra_args=''
[INFO ] Extend!
[INFO ] ['hamburger.#ffFF00', 'pickles.#00FFff', 'onion.#ffffff']
[INFO ] Extend!
[INFO ] ['hamburger.#ffFF00', 'pickles.#00FFff', 'onion.#ffffff']
[INFO ] Extend!
[INFO ] ['hamburger.#ffFF00', 'pickles.#00FFff', 'onion.#ffffff']
[INFO ] Extend!
[INFO ] ['hamburger.#ffFF00', 'pickles.#00FFff', 'onion.#ffffff']
[INFO ] Extend!
[INFO ] ['hamburger.#ffFF00', 'pickles.#00FFff', 'onion.#ffffff']
[INFO ] Extend!
[INFO ] ['hamburger.#ffFF00', 'pickles.#00FFff', 'onion.#ffffff']
[INFO ] Extend!
[INFO ] ['hamburger.#ffFF00', 'pickles.#00FFff', 'onion.#ffffff']
[INFO ] Extend!
[INFO ] ['hamburger.#ffFF00', 'pickles.#00FFff', 'onion.#ffffff']
[INFO ] Done!
If I view the info about an item, I get this:
Key | Value |
---|---|
index | [test] |
mtime | 2020-04-25 |
mime | text/html |
size | 2.4k |
path | |
tag | [ "hamburger.#00FF00" ] |
Even though printing to the log in the script shows all of the values (see above).
It is also interesting to note that if I add a tag through the UI, it does not show in the list of tags that are printed to the log.
Another interesting thing to note: If I remove the first item in the list using something like:
doc.json_data["tag"].remove('hamburger.#ffFF00')
The next item in the list displays.
And if I add a tag through the UI, it will not display when I
print(doc.json_data["tag"])
My use case might be helpful to know.
I've got a few thousand documents whose filenames follow a pattern. I want to break the filenames down into their disparate parts and use each element to create a hierarchical tagging system. Like so:
category1
    keyword1
        tag1
        tag2
        tag3
    keyword2
        tag4
        tag5
        tag6
    keyword3
        tag7
        tag8
        tag9
category2.1
    keyword2.1
        tag2.1
        tag2.2
        tag2.3
    keyword2.2
        tag2.4
        tag2.5
        tag2.6
    keyword2.3
        tag2.7
        tag2.8
        tag2.9
The "tags" are the elements in the url and I would be creating the hierarchy in the script. The script itself is working -- as far as I can tell. It is just not applying all of the tags (6 of them) that I want to pull from the filenames. Hopefully that context is helpful.
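As a rough sketch of the filename-to-hierarchy step described above: the actual filename pattern is not given in this thread, so the underscore separator, the "category_keyword_tag..." ordering, and the `filename_to_tags` helper are all assumptions made for illustration. The "name.#RRGGBB" tag form is taken from the hamburger example, and periods are used as the hierarchy separator.

```python
import os


def filename_to_tags(filename, color="#00FF00", sep="_"):
    """Build hierarchical tags from a filename (hypothetical pattern).

    Assumes the stem looks like category<sep>keyword<sep>tag1<sep>tag2...
    """
    stem = os.path.splitext(os.path.basename(filename))[0]
    parts = [p for p in stem.split(sep) if p]
    if len(parts) < 3:
        return []  # not enough elements to form category.keyword.tag
    category, keyword, *leaves = parts
    # One tag per leaf, with periods joining the hierarchy levels.
    return [f"{category}.{keyword}.{leaf}.{color}" for leaf in leaves]


print(filename_to_tags("category1_keyword1_tag1_tag2.html"))
```

Note that an element which itself contains a period (such as "category2.1" in the outline above) would be read as an extra hierarchy level, so such names would probably need to be rewritten before being joined.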
To add to this question, do you still use periods as hierarchy separators? so "category1.keyword1.tag1" adds the 'tag1' tag?
Thanks @oderyn, it might be a bug in the 3.2.x code, as i said it's not really been tested thoroughly yet.
in theory this should work:
doc.json_data["tag"] = ["hamburger.#00FF00", "pickles.#00FF00"]
> To add to this question, do you still use periods as hierarchy separators? so "category1.keyword1.tag1" adds the 'tag1' tag?
yes
> Thanks @oderyn, it might be a bug in the 3.2.x code, as i said it's not really been tested thoroughly yet.
Cool. Happy to be a guinea pig. :)
> in theory this should work:
> doc.json_data["tag"] = ["hamburger.#00FF00", "pickles.#00FF00"]
I just tested again. The first value is added to the array and displays in the UI. The second value is also added to the json, but not showing up in the UI.
Also, tags added through the UI are not showing up in the json (based on the readout in the task console when I print the doc.json_data["tag"]).
Lastly, it appears that the above command will replace all the items in the tag array. Not sure if that is intentional or not -- or just part of how Python works.
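On the replace-versus-append point: replacing is ordinary Python behaviour, since assigning to a dictionary key rebinds it to a new list, while extend() changes the existing list in place. The small sketch below uses a plain dict as a stand-in for doc.json_data (an assumption; it does not touch sist2 at all).

```python
# Plain dict standing in for doc.json_data.
json_data = {"tag": ["onion.#ffffff"]}

# Assignment rebinds the key to a brand-new list; the old contents are gone.
json_data["tag"] = ["hamburger.#00FF00", "pickles.#00FF00"]
replaced = list(json_data["tag"])

# extend() mutates the current list in place, so earlier tags are kept.
json_data["tag"].extend(["onion.#ffffff"])
extended = list(json_data["tag"])

print(replaced)
print(extended)
```

So keeping existing tags means reading the current list first and appending to it, rather than assigning a fresh list.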
I hope this was helpful.
Should be fixed in the latest docker tag!
sist2 version: 3.2.1
Platform (Linux or Docker, x86-64 or arm64): Docker
Elasticsearch version: 7.17.9
--
I want to add multiple tags to my files through user scripts. I've been testing with the hamburger example, and cannot figure it out. I also don't see any info about this in the docs. Apologies if I have overlooked it.
Here is what I've tried. Note that I did not try everything at the same time. The "--" indicates different attempts.
Any advice on how to troubleshoot or pointers on what I am doing wrong would be greatly appreciated.