RedeMocambos / baobaxia

Application for publishing content on the Baobaxia network
GNU General Public License v3.0

Tags don't work with accents #142

agger-magenta closed 9 years ago

agger-magenta commented 9 years ago

There's a problem passing Unicode characters to Popen().

Tags with accents are not stored properly as git-annex metatags. Investigating.

agger-magenta commented 9 years ago

Fixed.

LANG must be set in gunicorn-startup.sh. The locale has been specified as pt_BR.UTF-8 in the default gunicorn startup file.

This will work on new installations (though they must change the locale if they wish to use another one), but this file must be updated on all existing installations.
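As a rough illustration of why the locale matters, the sketch below runs a child process with LANG set explicitly rather than inheriting it from the startup script. The helper names `utf8_env` and `run_with_locale` are hypothetical and not part of Baobaxia; this is only a sketch under the assumption that the child process picks up its encoding from LANG.

```python
import os
import subprocess
import sys


def utf8_env(locale_name="pt_BR.UTF-8"):
    """Return a copy of the current environment with LANG set (hypothetical helper)."""
    env = dict(os.environ)
    env["LANG"] = locale_name
    return env


def run_with_locale(args, locale_name="pt_BR.UTF-8"):
    """Run a command with LANG set explicitly and return its raw output bytes."""
    return subprocess.check_output(args, env=utf8_env(locale_name))


if __name__ == "__main__":
    # Ask a child Python process which LANG value it sees.
    child = [sys.executable, "-c", "import os; print(os.environ['LANG'])"]
    print(run_with_locale(child).decode("ascii").strip())
```

Note that an LC_ALL variable already present in the environment takes precedence over LANG, so an installation that sets LC_ALL elsewhere would need to adjust that as well.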

There was also a problem when importing. Apparently the database interface doesn't like 8-bit strings. Tags are now looked up as unicode strings in create_objects_from_files, which works.
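The idea of converting 8-bit strings to unicode before a lookup can be sketched as follows. The helper `to_text` is hypothetical and is not the code used in create_objects_from_files; it assumes the tag bytes are UTF-8 encoded.

```python
def to_text(value, encoding="utf-8"):
    """Decode byte strings to unicode text; leave text values untouched.

    Hypothetical helper: assumes incoming byte strings are UTF-8 encoded.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


if __name__ == "__main__":
    raw_tag = u"Pajelança".encode("utf-8")  # an 8-bit string, as read from disk
    tag = to_text(raw_tag)                  # unicode text, safe for a database lookup
    print(tag == u"Pajelança")
```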

agger-magenta commented 9 years ago

This fixed setting the tag on the local git-annex instance, but tags are still broken after they are synchronized to another mucua.

This is the git-annex metadata for a single image on abdias:

exu@abdias:/data/bbx/repositories/mocambos/abdias/imagem/15/04/22$ git annex metadata pqd-territorios-digitais-livres-fd0d0.jpg 
metadata pqd-territorios-digitais-livres-fd0d0.jpg 
  b0d80df3-a5ce-4177-8e87-22a856bf70df-tag=Digital
  b0d80df3-a5ce-4177-8e87-22a856bf70df-tag=Pajelança
  b0d80df3-a5ce-4177-8e87-22a856bf70df-tag=Quilombólica
  b0d80df3-a5ce-4177-8e87-22a856bf70df-tag=Tainã
  b0d80df3-a5ce-4177-8e87-22a856bf70df-tag-lastchanged=2015-04-22@23-14-34
  lastchanged=2015-04-22@23-14-34
ok

This is the same command for the same file after running bbx-cron.sh:

(bbx)exu@hyndla:/data/bbx/repositories/mocambos/abdias/imagem/15/04/22$ git annex metadata pqd-territorios-digitais-livres-fd0d0.jpg
metadata pqd-territorios-digitais-livres-fd0d0.jpg 
  b0d80df3-a5ce-4177-8e87-22a856bf70df-tag=Digital
  b0d80df3-a5ce-4177-8e87-22a856bf70df-tag=Pajelança
  b0d80df3-a5ce-4177-8e87-22a856bf70df-tag=Quilombólica
  b0d80df3-a5ce-4177-8e87-22a856bf70df-tag=Tainã
  b0d80df3-a5ce-4177-8e87-22a856bf70df-tag-lastchanged=2015-04-22@23-14-34
  lastchanged=2015-04-22@23-14-34
ok

Maybe git-annex doesn't handle 8-bit characters in metadata well - it looks like it. I'll investigate, and if I don't find anything, I'll email Joey Hess.

agger-magenta commented 9 years ago

I emailed Joey, and he said:

"This sounds like a bug that I fixed in version 5.20150317 of git-annex."

To make tags work, all mucuas should be upgraded to git-annex 5.20150317 or newer!
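A mucua could check whether its git-annex is new enough with something like the sketch below. The function names are hypothetical, and the parser assumes version strings of the form `<major>.<date>` with an optional suffix, such as `5.20150317`; obtaining the string from the installed git-annex is left out.

```python
import re

# Minimum version named in the reply quoted above.
MIN_VERSION = (5, 20150317)


def parse_version(text):
    """Parse a '<major>.<date>' version string into a tuple of two ints."""
    match = re.match(r"\s*(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("unrecognized git-annex version: %r" % (text,))
    return int(match.group(1)), int(match.group(2))


def is_new_enough(version_text):
    """Return True if the version is 5.20150317 or newer."""
    return parse_version(version_text) >= MIN_VERSION


if __name__ == "__main__":
    for candidate in ("5.20141125", "5.20150317", "6.20160126"):
        print(candidate, is_new_enough(candidate))
```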

agger-magenta commented 9 years ago

It works if you upgrade with the git-annex version in this archive:

http://neuro.debian.net/install_pkg.html?p=git-annex-standalone

A mucua with this version installed may now receive tags correctly.

The previously mentioned update, to the gunicorn configuration, is still necessary in order to write the tags correctly.

20 commented 9 years ago

It seems that the interface doesn't accept accents in tags. Verified on dpadua.

agger-magenta commented 9 years ago

I tried associating a tag without accents with an existing media item on dpadua. It succeeded. I then tried the same with an accented tag. It failed, with an error message saying that name and namespace were not unique.

I tried the same on hyndla, and both operations succeeded. This is strange. Is that the same use case you tried? How do you reproduce the error to verify?

20 commented 9 years ago

I suppose it's related to a python 2.7 bug with subprocess: http://stackoverflow.com/questions/1910275/unicode-filenames-on-windows-with-python-subprocess-popen

Which Python version do you have in the bbx virtualenv on hyndla?

We should find a workaround until we move to Python 3.
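One possible workaround, assuming the failure comes from unicode arguments being encoded implicitly when they are handed to subprocess, is to encode every argument to UTF-8 bytes explicitly before the call. The helper `encode_args` is hypothetical, and this is only a sketch, not a confirmed fix for this issue.

```python
def encode_args(args, encoding="utf-8"):
    """Encode unicode text arguments to bytes; pass byte strings through unchanged.

    Hypothetical helper: assumes the receiving program expects UTF-8.
    """
    encoded = []
    for arg in args:
        if isinstance(arg, bytes):
            encoded.append(arg)
        else:
            encoded.append(arg.encode(encoding))
    return encoded


if __name__ == "__main__":
    # The command line here is illustrative only.
    command = encode_args([u"echo", u"Tainã"])
    print(command)
```

On POSIX systems the resulting list of byte strings can be passed to `subprocess.Popen` directly, so the encoding no longer depends on the locale of the calling process.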

agger-magenta commented 9 years ago

Hyndla's Python version is 2.7.6.