VTUL / vtechworks

DSpace at Virginia Tech
http://vtechworks.lib.vt.edu

Unable to change filename in Edit Item view #738

Closed alawvt closed 3 years ago

alawvt commented 3 years ago

@pyc1 and I report:

When we attempt to change a file name in the "Edit Item"->"Edit Bitstream" form, https://vtechworks.lib.vt.edu/admin/item, the new file name is not saved after we click "Save."

The same thing occurs on vtechworks-dev.

alawvt commented 3 years ago

I am able to change a filename of a bitstream in my LDE, running, I believe, the same commit that is on prod and dev.

pyc1 commented 3 years ago

This issue also affects the file description. For example, if I type "Transcript" in the description field and then click Save, it is not saved.

alawvt commented 3 years ago

Thanks, @pyc1, for that information. @soumikgh and I worked on this issue yesterday and found that on both dev and prod, we are able to change the file name of newly created items but not of items created a week or more ago. I plan to look at it more in the LDE today.

alawvt commented 3 years ago

The file name and file description are in the metadatavalue table of the database. For example, in my LDE I uploaded the file Lawrence_AS_D_2017.pdf. In the Edit Item interface I gave it the description "Lawrence_AS_D_2017 description" and then renamed it Lawrence_AS_D_2017b.pdf. Apparently, it keeps both the original file name (in dc.source, metadata_field_id 55) and the newest file name (in dc.title, metadata_field_id 64) in the table. Only the latest file description is kept (in dc.description, metadata_field_id 26) in the table.

| metadata_value_id | metadata_field_id | text_value | text_lang | place | authority | confidence | dspace_object_id |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 6049167 | 55 | Lawrence_AS_D_2017.pdf | NULL | 0 | NULL | -1 | bfc4842c-cdb3-43be-987b-0167b656fa71 |
| 6049179 | 26 | Lawrence_AS_D_2017 description | NULL | 0 | NULL | -1 | bfc4842c-cdb3-43be-987b-0167b656fa71 |
| 6049180 | 64 | Lawrence_AS_D_2017b.pdf | NULL | 0 | NULL | -1 | bfc4842c-cdb3-43be-987b-0167b656fa71 |

alawvt commented 3 years ago

I created the item "test of changing a file name 2020-12-21" on 2020-12-21. That day, I was able to change the file name from Lawrence_AS_D_2017.pdf to Lawrence_AS_D_2017_new.pdf. Today, 2020-12-23, I am unable to change that file name or add a file description. This is occurring on dev and prod but not in my LDE. Since the file name and file description are both stored in the database, I suspect that this is caused by our recently implemented update_metadata_language_code.sh cron job. @keithgee or @soumikgh, do you have ideas about this?

alawvt commented 3 years ago

There is a text_lang associated with the file name and file description. I ran the script on my LDE and the table is now,

| metadata_value_id | metadata_field_id | text_value | text_lang | place | authority | confidence | dspace_object_id |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 6049167 | 55 | Lawrence_AS_D_2017.pdf | en | 0 | NULL | -1 | bfc4842c-cdb3-43be-987b-0167b656fa71 |
| 6049185 | 26 | Lawrence_AS_D_2017 description even newer | en | 0 | NULL | -1 | bfc4842c-cdb3-43be-987b-0167b656fa71 |
| 6049186 | 64 | Lawrence_AS_D_2017d.pdf | en | 0 | NULL | -1 | bfc4842c-cdb3-43be-987b-0167b656fa71 |

with text_lang=en.

I am now unable to change the file name or file description for this item.

keithgee commented 3 years ago

@alawvt, excellent investigative work. It seems we'll need to change the text_lang back to NULL for filenames and descriptions. We'll also need to change the SQL in the script so that it doesn't update text_lang for filenames and descriptions, if you want to continue to use it.

keithgee commented 3 years ago

Oops. Apologies to Aaron Law; I meant to tag my colleague. @alawvt, let's see if we can straighten this out after standup or sometime in the afternoon if you like.

alawvt commented 3 years ago

@keithgee, thank you for reviewing this issue. We would definitely like to keep this script.

We can test changing the language code back to NULL for the file name and file description to make sure that cures the problem.

Then we can modify the script to only operate on item metadata.

And we can figure out how to test if this has had any effect on other things like bundles, collections, and communities.

Then we can write a script to change all the bitstream metadata on the three servers back to NULL, assuming that cures the problem.

alawvt commented 3 years ago

Continued in "Modify update_metadata_language_code.sh to only act on items and remedy on dev, pprd, and prod".

alawvt commented 3 years ago

That's interesting: my attempts on my LDE to change the file name and file description were not processed in the Edit Item form. However, they show up as additions in the metadatavalue table:

| metadata_value_id | metadata_field_id | text_value | text_lang | place | authority | confidence | dspace_object_id |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 6049167 | 55 | Lawrence_AS_D_2017.pdf | en | 0 | NULL | -1 | bfc4842c-cdb3-43be-987b-0167b656fa71 |
| 6049185 | 26 | Lawrence_AS_D_2017 description even newer | en | 0 | NULL | -1 | bfc4842c-cdb3-43be-987b-0167b656fa71 |
| 6049186 | 64 | Lawrence_AS_D_2017d.pdf | en | 0 | NULL | -1 | bfc4842c-cdb3-43be-987b-0167b656fa71 |
| 6049191 | 26 | Lawrence_AS_D_2017 description even newer | NULL | 1 | NULL | -1 | bfc4842c-cdb3-43be-987b-0167b656fa71 |
| 6049192 | 64 | Lawrence_AS_D_2017e.pdf | NULL | 1 | NULL | -1 | bfc4842c-cdb3-43be-987b-0167b656fa71 |

When we undo this script's changes, we're going to need to watch for multiple values of the same field for a bitstream, etc.
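As a rough illustration of what "watching for multiple values" could look like, here is a minimal in-memory sketch that groups rows by object and field and reports any group with more than one value. The `DuplicateFieldSketch` class, its `Row` type, and the `findDuplicates` helper are hypothetical names made up for this example; they are not part of DSpace, and a real check would presumably run against the metadatavalue table itself.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: not DSpace code. Models a few metadatavalue columns in memory.
class DuplicateFieldSketch {
    static final class Row {
        final long valueId;     // metadata_value_id
        final int fieldId;      // metadata_field_id
        final String textLang;  // text_lang (null models SQL NULL)
        final String objectId;  // dspace_object_id

        Row(long valueId, int fieldId, String textLang, String objectId) {
            this.valueId = valueId;
            this.fieldId = fieldId;
            this.textLang = textLang;
            this.objectId = objectId;
        }
    }

    // Groups rows by "objectId/fieldId" and keeps only groups holding 2+ values.
    static Map<String, List<Long>> findDuplicates(List<Row> rows) {
        Map<String, List<Long>> byKey = new LinkedHashMap<>();
        for (Row r : rows) {
            byKey.computeIfAbsent(r.objectId + "/" + r.fieldId, k -> new ArrayList<>())
                 .add(r.valueId);
        }
        byKey.values().removeIf(ids -> ids.size() < 2);
        return byKey;
    }

    public static void main(String[] args) {
        String id = "bfc4842c-cdb3-43be-987b-0167b656fa71";
        // The five rows from the LDE table above.
        List<Row> rows = List.of(
            new Row(6049167L, 55, "en", id),
            new Row(6049185L, 26, "en", id),
            new Row(6049186L, 64, "en", id),
            new Row(6049191L, 26, null, id),
            new Row(6049192L, 64, null, id));
        findDuplicates(rows).forEach((key, ids) -> System.out.println(key + " -> " + ids));
    }
}
```

With the rows from my LDE, fields 26 and 64 are each reported with two value IDs, while field 55 is not reported.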

keithgee commented 3 years ago

The above table from your LDE helps to explain what's happening, too. There's a bit of code in Bitstream.java, lines 135-137.

public void setName(Context context, String n) throws SQLException {
    getBitstreamService().setMetadataSingleValue(context, this, MetadataSchema.DC_SCHEMA, "title", null, null, n);
}

One of the two null parameters in the call to setMetadataSingleValue is the language.

setMetadataSingleValue is used where there's expected to be 0 or 1 matching pieces of metadata, for example, the "title", or name, of a single file. It works by first looking for existing metadata (within the context of the bitstream, in this case) that matches the schema, element, qualifier, and language given, deleting that existing metadata, and then adding a new piece of metadata with that schema, element, qualifier, language, and the new value.

Because, in the old metadata, the language is unexpectedly set to "en" instead of NULL, that piece of metadata is never found and never deleted. DSpace continues along and merrily adds the new, updated filename as new metadata.

When the filename is displayed again on the view pages, it's only expecting there to be one value for the filename. But now there are multiple; it displays the first one it finds - and this seems to be the one with the lower ID, the old name, at least that we've seen so far.
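To make the sequence above concrete, here is a small self-contained sketch of the delete-then-add behavior and how a changed language code defeats it. This is a simplified in-memory model, not DSpace's implementation: the `SingleValueSketch` class, its `Value` type, and the `setSingleValue` and `firstTitle` methods are hypothetical names, and the "show the first value found" rule is an assumption based on what we've observed so far.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical in-memory model of one bitstream's metadata; not DSpace's classes.
class SingleValueSketch {
    static final class Value {
        final String element;
        final String qualifier;
        final String language; // null models a NULL text_lang
        final String text;

        Value(String element, String qualifier, String language, String text) {
            this.element = element;
            this.qualifier = qualifier;
            this.language = language;
            this.text = text;
        }
    }

    final List<Value> rows = new ArrayList<>();

    // Deletes rows whose element, qualifier, and language all match exactly
    // (null matches only null), then adds the new value.
    void setSingleValue(String element, String qualifier, String language, String text) {
        rows.removeIf(v -> v.element.equals(element)
                && Objects.equals(v.qualifier, qualifier)
                && Objects.equals(v.language, language));
        rows.add(new Value(element, qualifier, language, text));
    }

    // Assumed display rule: return the first "title" value found, oldest first.
    String firstTitle() {
        for (Value v : rows) {
            if (v.element.equals("title")) {
                return v.text;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        SingleValueSketch bitstream = new SingleValueSketch();
        bitstream.setSingleValue("title", null, null, "Lawrence_AS_D_2017.pdf");

        // Simulate the cron job changing the language code from NULL to "en".
        Value old = bitstream.rows.get(0);
        bitstream.rows.set(0, new Value(old.element, old.qualifier, "en", old.text));

        // The rename looks for language == null, finds nothing to delete, and appends.
        bitstream.setSingleValue("title", null, null, "Lawrence_AS_D_2017e.pdf");

        System.out.println(bitstream.rows.size() + " title rows; displayed: "
                + bitstream.firstTitle());
    }
}
```

In this model the rename leaves two title rows, and the displayed name is still the old one, which matches what the LDE table shows.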

We could change the internals of how DSpace handles updating filenames and file descriptions, but I think the plan we made is probably a better idea in the long run: change the script so that it doesn't touch metadata that's not associated with items, and, separately, change the language code back to NULL in metadata that's not directly associated with items.

Edited to add: I think it's a better idea in the long run to stick with the original plan for three reasons. First, we don't diverge too far from the main DSpace project, which would make future updates take longer. Second, we don't know yet what future versions of DSpace will do. Third, and to a lesser extent, I'm a little nervous about accidentally breaking something else that's built on assumptions that we don't know about yet.

alawvt commented 3 years ago

@keithgee, thank you very much for explaining how DSpace updates a bitstream filename or other non-item metadata. This explains why there are multiple filenames, but only the older one displays. Since we only want to edit the language code for item metadata, I'd like to limit our fix to that and not change setMetadataSingleValue. We do have the issue now of a few items having multiple filenames for the same bitstream. I don't know how we'd easily remedy that, but I believe it would be worth having those wrong to fix all the rest. Thanks!

alawvt commented 3 years ago

This has been continued and resolved in Modify update_metadata_language_code.sh to only act on items and remedy on dev, pprd, and prod.