Open plakitkelly opened 6 months ago
@KaarelKa might this be the same as this issue: https://github.com/keeleinstituut/tv-tolkevarav/issues/634
Probably the same issue, yeah. Will check and fix.
@MariusJulius @thenouan It seems this doesn't work as well as I hoped. We do get some information, but FE can't access the Content-Disposition header, so we only get the file type, which is text/html in most cases. Many of these files also work with a .zip extension, in which case the zip shows all the included files; with .txt it shows just one of them.
Basically, if BE allowed us to access the Content-Disposition header, we could make sure we always use the correct extension:
Access-Control-Expose-Headers: Content-Disposition
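For illustration, here is a minimal sketch of how FE could read the extension once the header is exposed. The function names `filenameFromContentDisposition` and `extensionOf` are hypothetical and not taken from the project code. Until BE sends the expose header, `response.headers.get("Content-Disposition")` returns `null` for cross-origin responses, which is the case the first check covers.

```typescript
// Hypothetical helper (not from the project): pull the filename out of a
// Content-Disposition value such as `attachment; filename="project.xlf"`.
function filenameFromContentDisposition(header: string | null): string | null {
  if (!header) return null; // header missing or not exposed to the browser

  // Extended form: filename*=UTF-8''percent-encoded-name
  const extended = /filename\*\s*=\s*([^']*)'[^']*'([^;]+)/i.exec(header);
  if (extended) {
    try {
      return decodeURIComponent((extended[2] ?? "").trim());
    } catch {
      // Malformed percent-encoding: fall through to the plain form.
    }
  }

  // Plain form: filename="name.ext" or filename=name.ext
  const plain = /filename\s*=\s*(?:"([^"]*)"|([^;]+))/i.exec(header);
  if (plain) {
    const value = (plain[1] ?? plain[2] ?? "").trim();
    return value.length > 0 ? value : null;
  }
  return null;
}

// Hypothetical helper: lower-cased extension without the dot, or null.
function extensionOf(filename: string): string | null {
  const dot = filename.lastIndexOf(".");
  if (dot <= 0 || dot === filename.length - 1) return null;
  return filename.slice(dot + 1).toLowerCase();
}
```

With a `fetch` response this would be used as `filenameFromContentDisposition(response.headers.get("Content-Disposition"))`, falling back to the current FE behaviour when it returns `null`.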
The alternative is that FE assumes the extension based on the endpoint, but this can only be implemented for endpoints that always return the file in the same format, which doesn't seem to be the case for most of the file download endpoints. If we get a list of the ones that do, we can skip exposing Content-Disposition to the browser for those endpoints.
Added a minor FE improvement to increase the likelihood of the user being able to open the file for now.
FE side is currently blocked: we need Access-Control-Expose-Headers: Content-Disposition on the response so we can determine the correct extension on the FE client side.
Relevant downloads for this task. @MariusJulius @thenouan Maybe you can comment on these as well, if you know something:
Cat analysis download (GET /cat-tool/download-volume-analysis/id). Currently returns .txt, but not sure if it can be something else as well. I have seen .zip at some point in the past.
Download Xliff (GET translation-order/api/cat-tool/download-xliff/id). Currently we download it as .txt on FE, but it seems BE sends .xlf. Not sure if this will always be .xlf though; I think this was one of the ones that could also be a .zip.
Download translated project (GET /translation-order/api/cat-tool/download-translated/id). Currently FE downloads it as .txt. If I remember correctly it can also be a .zip, and I currently see that BE sends it as .bin for at least one project.
Audit logs (GET event-records/export). Exported as .csv from FE; I don't think we can have any other file formats here.
Export project csv (GET /projects/export-csv). Will change this to .csv on FE, assuming it can't be anything else?
Export translation memory (GET /tm/export/file/id). Exported as .tmx from FE; I don't think we can have any other file formats here.
Export users (GET /institution-users/export-csv). Exported as .csv from FE; I don't think we can have any other file formats here.
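As a rough sketch of the endpoint-based fallback, the four exports in the list above that are believed to be fixed-format could be mapped to an extension directly. That these endpoints never return another format is an assumption still waiting on BE confirmation, and `FIXED_EXTENSIONS` and `fixedExtensionFor` are hypothetical names.

```typescript
// Assumption: these endpoints always return a single format, as guessed in
// the list above. The CAT tool downloads are left out on purpose because
// their format varies.
const FIXED_EXTENSIONS: ReadonlyArray<readonly [RegExp, string]> = [
  [/(^|\/)event-records\/export$/, "csv"],
  [/(^|\/)projects\/export-csv$/, "csv"],
  [/(^|\/)institution-users\/export-csv$/, "csv"],
  [/(^|\/)tm\/export\/file\/[^/]+$/, "tmx"],
];

// Hypothetical helper: returns the assumed extension for a request path,
// or null when the endpoint is not known to be fixed-format.
function fixedExtensionFor(path: string): string | null {
  const queryStart = path.indexOf("?");
  const pathname = queryStart === -1 ? path : path.slice(0, queryStart);
  for (const [pattern, extension] of FIXED_EXTENSIONS) {
    if (pattern.test(pathname)) return extension;
  }
  return null;
}
```

A `null` result means the caller still has to rely on Content-Disposition or some other check.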
"3 dot menu has options related to matecat. according to matecat documentation if there is more than 1 project file it gives zip" - Tried with the xliff download and translated project download, didn't seem to work. When I had 2 sub project files + 2 files in the "Lähtefailid tõlketööriistas"
-- "Laadi alla xliff" ("Download xliff") (2.): BE still sent the .xlf file extension, not zip
-- "Laadi alla valmis tõlge" ("Download finished translation") (3.): BE still sent the .bin file extension, not zip
Not 100% sure, but: 1) should be TXT (even if the split info is shown in one file) 2) should be XLF (even if the split info is shown in one file) 3) should be the same format that was added (e.g. docx in, docx out: https://guides.matecat.com/finalising-a-project - I downloaded a file from OIU-2023-11--102 which came as txt, changed it to docx, and it opened.)
I tried to reproduce downloading a zip file, but I couldn't.
BUT I created an order with multiple source files and downloaded the translated file.
As MJ said, it should download a zip file for multiple files. It downloaded a txt file with the content of a zip file.
On the right you see the zip file, which I opened with Notepad. On the left you can see the just-downloaded "tõlgitud fail" ("translated file") that came as txt. The content is so similar that I think the system produced zip content but forces the download to be a txt file.
And I opened that broken zip file with Notepad; its content is the translated text. So on 22.12, when I downloaded the "broken" file, the system gave a zip file with txt content. That's why I couldn't open or unzip the zip file.
Yeah, you are correct, it arrives as .bin from BE, but the client-side app is not allowed to see the type of the file and has to decide which format to force it to. In this case the ".bin" file can work either as a zip or a txt file from what I've seen, so @thenouan even if you add the Access-Control-Expose-Headers: Content-Disposition header to the response, we still won't be able to tell, in this specific case, whether we should force the .bin to zip or txt.
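One option that has not been proposed in this thread, sketched here only as a possibility, is to look at the first bytes of the downloaded body: ZIP archives start with the bytes `PK` followed by one of a few fixed marker pairs. The name `looksLikeZip` is hypothetical. Note that .docx files are ZIP containers too, so this check alone cannot tell a zip of several files from a single docx.

```typescript
// Hypothetical helper: true when the data starts with a ZIP signature.
// PK\x03\x04 = local file header, PK\x05\x06 = empty archive,
// PK\x07\x08 = spanned archive.
function looksLikeZip(bytes: Uint8Array): boolean {
  if (bytes.length < 4) return false;
  if (bytes[0] !== 0x50 || bytes[1] !== 0x4b) return false; // "PK"
  const third = bytes[2];
  const fourth = bytes[3];
  return (
    (third === 0x03 && fourth === 0x04) ||
    (third === 0x05 && fourth === 0x06) ||
    (third === 0x07 && fourth === 0x08)
  );
}

// Sketch of use on a downloaded Blob: only the first four bytes are read.
async function blobLooksLikeZip(blob: Blob): Promise<boolean> {
  const head = new Uint8Array(await blob.slice(0, 4).arrayBuffer());
  return looksLikeZip(head);
}
```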
Actually I can try to add some optimistic workaround for these on FE.
Update
Okay, it seems we get "cat_files" from BE for each subproject, which should be enough to do the check. Will do some testing, but hopefully it will work.
Okay, it seemed to work, based on my testing at least.
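The exact check is not shown in this thread, so the following is only a guess at what a "cat_files" based decision might look like. It assumes each entry carries a file name (the `name` field and the `CatFile` type are hypothetical), treats more than one file as a zip, and otherwise reuses the extension of the single file.

```typescript
// Hypothetical shape: the real "cat_files" entries may look different.
interface CatFile {
  name: string;
}

// Hypothetical helper: guess the extension for the translated download.
// Assumptions: several files means a zip, a single file keeps its own
// extension, and anything unknown falls back to the given default.
function translatedDownloadExtension(
  catFiles: ReadonlyArray<CatFile>,
  fallback: string = "txt",
): string {
  if (catFiles.length > 1) return "zip";
  const only = catFiles[0];
  if (!only) return fallback;
  const dot = only.name.lastIndexOf(".");
  if (dot <= 0 || dot === only.name.length - 1) return fallback;
  return only.name.slice(dot + 1).toLowerCase();
}
```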
@KaarelKa Did you solve the 3rd one?
I translated a docx file, and its content is as in the screenshot.
It's a Word document opened with Notepad.
If I try to save this as docx, it doesn't work. But when I open the txt file in Word, it works.
So downloading txt or zip is not a solution. It should download in the same format as it was uploaded.
Yeah, got a workaround for this as well. Basically
I tried to find any TM that downloads only one tmx file, but I couldn't find one. With some TMs it exports an empty zip (actually a tmx, but opened with 7zip).
I don't understand what it means that it downloads many tmx files. Maybe I imported many tmx files, but is it correct that one TM contains many tmx files?
@plakitkelly Currently filtering the logs doesn't work; test nr 4 again. Other files are OK (except 6).
Log - OK
Not deployed yet, it seems.
@plakitkelly Deploy was just done now, can you test again?
Tested on 08.01 - 6 ok. All OK
"Laadi alla valmis tõlge" ("Download finished translation") from the three-dot menu: file is broken.