Closed kemartin2015 closed 1 year ago
Hi @helrond, thank you for these suggestions. I worked on addressing your comments, but I still don't think I'm passing client correctly because I'm getting this error pretty consistently:
raise Exception('Could not resolve "{}" to a URI'.format(thingit)) Exception: Could not resolve "Namespace(note_type='accessrestrict', action='delete', resource_id=13291, search_string='restrict', level='all', replace_string=None)" to a URI
I agree that it makes sense to change this script to an object-oriented approach. It's been a challenge to jump back in and make changes with all of the arguments that need to be passed around.
Hi @helrond, I made quite a few changes, and the script is working now. Could you take another look whenever you get a chance?
Hi @helrond, I think I've addressed the issues in your previous round of comments, but I have some more questions now:
main.
Does it make sense to move where resource
is defined and walk_tree
to the main method to alleviate some of the passing issues? I’m also not confident in the logic in main
.handle_matching_notes
even after moving save_record
out. Is that okay or should these lines be moved back into process_tree
? args.resource_id
into process_tree
. Do I need to rename the args.
arguments to prevent that error?Hi @helrond, I'm getting a couple errors when I run the script.
In line 60, i get this error:
TypeError: argument 1 must have a "write" method
In line 26, I was getting the following error. Is the issue with the content
argument? The script runs if I change content
to note
in line 26, but then it modifies/deletes every access note even with a high confidence ratio.
Traceback (most recent call last):
File "/Users/k.martin/Documents/GitHub/scripts/archivessnake/edit_notes.py", line 83, in <module>
main(args.note_type, args.action, args.resource_id, args.search_string, args.level, args.replace_string)
File "/Users/k.martin/Documents/GitHub/scripts/archivessnake/edit_notes.py", line 65, in main
processed_record, updated = process_record(record, level, note_type, action, search_string, replace_string)
File "/Users/k.martin/Documents/GitHub/scripts/archivessnake/edit_notes.py", line 26, in process_record
if text_in_note(content, search_string, CONFIDENCE_RATIO):
File "/opt/homebrew/lib/python3.9/site-packages/asnake/utils/__init__.py", line 117, in text_in_note
note = resolve_to_json(note, client)
File "/opt/homebrew/lib/python3.9/site-packages/asnake/utils/__init__.py", line 49, in resolve_to_json
json = client.get(thingit).json()
AttributeError: 'int' object has no attribute 'get'
OK, there was an error with the way the CSV Writer was being created; I've fixed that now and that resolves the issue in line 60.
The other issue is slightly more complex. It looks like the text_in_note
utility is expecting that you will pass a note object, a query string, an ASpace client and optionally a confidence ratio but instead we are passing subnote content and a query string. HOWEVER we can't just move that conditional further up and update the arguments because you're replacing content in a specific subnote. So it might actually make sense here to go back to the way you had it originally, even though it's somewhat duplicative of what ASnake can do.
Ok, sounds good! Thanks for explaining the text_in_note
issue. That makes a lot of sense.
Hi @helrond, I reverted text_in_note
back to contains_match,
but I don't think your change to csv writer was committed because I'm still getting that error.
Ah, sorry I forgot to push. Changes are there now.
I tested the script on FA739A and FA739B in AS dev, and it looks good to me. I didn't have issues modifying or deleting access notes. The output CVSs matched the original output of the notes_csv,py script, and I didn't see any unexpected changes in other notes or other areas of the finding aid. Is there anything else I should be testing for?
I don't think so? We mostly just moved things around so they're a little more atomic. At some point in the future we could also write tests for these scripts (particularly the ones which are writing data back to AS) so we have a better safety net.
Ok, sounds good! Should I go ahead and merge these changes?
Sure, go ahead and merge!
I tried to address the deprecated ASnake endpoint issue, remove the global declarations, and add more information to the csv file to make it more consistent with the csv file output in the notes_csv.py script.
So far, I'm consistently getting error messages in line 25 where I'm trying to get json for notes each time I try a different approach. Could you offer some guidance on how to fix this line? I'm also unsure about my changes to lines 42 and 52.