Closed kumarvivek1752 closed 6 months ago
ckan's validation code assumes that `_` is a field separator when "unflattening" form field names.
For ckanext-fluent to support languages like `zh_CN` we'll need to have ckanext-fluent convert the `_`s to something else when generating the form field names, and convert them back when storing them as JSON.
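The round trip described above could look roughly like the sketch below. It is only an illustration: the helper names and the `~` placeholder are hypothetical and do not exist in ckanext-fluent, and the placeholder is assumed never to appear in a configured language code.

```python
# Hypothetical helpers, not part of ckanext-fluent.
# Assumption: "~" never appears in any configured language code.
PLACEHOLDER = "~"


def lang_to_field_suffix(lang):
    """Replace "_" so the language code is safe to embed in a form field name."""
    return lang.replace("_", PLACEHOLDER)


def field_suffix_to_lang(suffix):
    """Undo lang_to_field_suffix before the value is stored as JSON."""
    return suffix.replace(PLACEHOLDER, "_")


def form_field_name(field, lang):
    """Build a form field name such as "title_translated-zh~CN"."""
    return f"{field}-{lang_to_field_suffix(lang)}"


def parse_form_field_name(name, field):
    """Return the original language code, or None if name is not a field for `field`."""
    prefix = field + "-"
    if not name.startswith(prefix):
        return None
    return field_suffix_to_lang(name[len(prefix):])
```

With these helpers, `form_field_name("title_translated", "zh_CN")` produces a name with no `_` in the language part, and `parse_form_field_name` recovers `zh_CN` from it.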
Can you guide me on where I have to make changes to support languages that have `_`?
I think the `_` characters in the lang values need to be replaced in the form: https://github.com/ckan/ckanext-fluent/blob/e882c241c57f80ea4e4f1d72f07f2dd64588310e/ckanext/fluent/templates/scheming/form_snippets/fluent_text.html#L5C1
And when being parsed: https://github.com/ckan/ckanext-fluent/blob/e882c241c57f80ea4e4f1d72f07f2dd64588310e/ckanext/fluent/validators.py#L138
Thanks for the quick reply, let me debug it.
It gives an "invalid language code" error for both `zh_Hans_CN` and `zh_CN`.
pdb (zh_CN):
> /srv/app/src/ckan/ckan/lib/navl/dictization_functions.py(305)validate()
-> flat_data, errors = _validate(flattened, schema, validators_context)
(Pdb) _validate(flattened, schema, validators_context)
({('name',): 'vghv', ('owner_org',): '1ab37f13-bf77-43a3-a708-8410d6f18496', ('private',): True, ('tag_string',): '', ('state',): 'draft', ('type',): 'dataset', ('title',): 'vghv', ('title_translated',): <ckan.lib.navl.dictization_functions.Missing object at 0x7f9fed501cc0>, ('notes_translated',): <ckan.lib.navl.dictization_functions.Missing object at 0x7f9fed501cc0>, ('extras', 0, 'key'): 'notes_translated', ('extras', 0, 'value'): <ckan.lib.navl.dictization_functions.Missing object at 0x7f9fed501cc0>, ('extras', 1, 'key'): 'title_translated', ('extras', 1, 'value'): <ckan.lib.navl.dictization_functions.Missing object at 0x7f9fed501cc0>}, {('__before',): [], ('id',): [], ('name',): [], ('title',): [], ('author',): [], ('author_email',): [], ('maintainer',): [], ('maintainer_email',): [], ('license_id',): [], ('notes',): [], ('url',): [], ('version',): [], ('state',): [], ('type',): [], ('owner_org',): [], ('private',): [], ('__extras',): [], ('__junk',): [], ('tag_string',): [], ('plugin_data',): [], ('save',): [], ('return_to',): [], ('title_translated',): [], ('notes_translated',): [], 'notes_translated-zh_CN': ['invalid language code: "zh_CN"'], 'title_translated-zh_CN': ['invalid language code: "zh_CN"']})
(Pdb)
pdb (zh_Hans_CN):
> /srv/app/src/ckan/ckan/lib/navl/dictization_functions.py(305)validate()
-> flat_data, errors = _validate(flattened, schema, validators_context)
(Pdb) _validate(flattened, schema, validators_context)
({('name',): 'hbhjs', ('owner_org',): '1ab37f13-bf77-43a3-a708-8410d6f18496', ('private',): True, ('tag_string',): '', ('state',): 'draft', ('type',): 'dataset', ('title',): 'hbhjs', ('title_translated',): <ckan.lib.navl.dictization_functions.Missing object at 0x7f0fe1609cc0>, ('notes_translated',): <ckan.lib.navl.dictization_functions.Missing object at 0x7f0fe1609cc0>, ('extras', 0, 'key'): 'notes_translated', ('extras', 0, 'value'): <ckan.lib.navl.dictization_functions.Missing object at 0x7f0fe1609cc0>, ('extras', 1, 'key'): 'title_translated', ('extras', 1, 'value'): <ckan.lib.navl.dictization_functions.Missing object at 0x7f0fe1609cc0>}, {('__before',): [], ('id',): [], ('name',): [], ('title',): [], ('author',): [], ('author_email',): [], ('maintainer',): [], ('maintainer_email',): [], ('license_id',): [], ('notes',): [], ('url',): [], ('version',): [], ('state',): [], ('type',): [], ('owner_org',): [], ('private',): [], ('__extras',): [], ('__junk',): [], ('tag_string',): [], ('plugin_data',): [], ('save',): [], ('return_to',): [], ('title_translated',): [], ('notes_translated',): [], 'notes_translated-zh_Hans_CN': ['invalid language code: "zh_Hans_CN"'], 'title_translated-zh_Hans_CN': ['invalid language code: "zh_Hans_CN"']})
(Pdb)
Hello guys,
I may have found a solution for this case.
ckanext-fluent/ckanext/fluent/validators.py
Line 16 BCP_47_LANGUAGE = u'^[a-z]{2,8}(-[0-9a-zA-Z]{1,8})*$'
The expression needs to be changed to accept `_` too:
BCP_47_LANGUAGE = u'^[a-z]{2,8}([-_][0-9a-zA-Z]{1,8})*$'
After this change, everything works fine for me.
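To make the effect of that change concrete, here is a small check of both expressions against a few sample codes. The two patterns are copied from this comment; the sample list and the `accepted` helper are only illustrative.

```python
import re

# Original expression from validators.py, as quoted above.
OLD_PATTERN = r'^[a-z]{2,8}(-[0-9a-zA-Z]{1,8})*$'
# Proposed expression that also allows "_" between subtags.
NEW_PATTERN = r'^[a-z]{2,8}([-_][0-9a-zA-Z]{1,8})*$'

# Illustrative sample codes (not an exhaustive list).
CODES = ["en", "zh-CN", "zh_CN", "zh_Hans_CN", "en-t-fr", "zh_TW.Big5"]


def accepted(pattern, codes=CODES):
    """Return the codes from `codes` that match `pattern`."""
    return [code for code in codes if re.match(pattern, code)]


print("old:", accepted(OLD_PATTERN))
print("new:", accepted(NEW_PATTERN))
```

The old expression rejects every code containing `_`, while the new one accepts `zh_CN` and `zh_Hans_CN`. Neither accepts a code with a `.` in it, such as `zh_TW.Big5`.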
I guess we can't use BCP-47 for languages because the keys we're passed are locale codes(?) which are different. e.g. transifex supports these: https://explore.transifex.com/languages/ and they include suffixes like `_TW.Big5` and `@latin`, so we might need `.` and `@` too.
Note that for our site we are using BCP-47 for things like `en-t-fr` to mark strings automatically translated, so I don't want to drop this completely. Maybe we can get away with accepting both styles of strings? Someone that knows more about localization could weigh in here.
Another thought: It looks like we could convert most of the locale codes ckan will use to BCP-47 for fluent and the API with a `.replace('_', '-')` before passing the code in. This way we're representing languages consistently in the API.
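A rough sketch of that conversion, checked against the `BCP_47_LANGUAGE` expression quoted earlier in the thread. The function names are hypothetical, and where such a conversion would actually be hooked into ckanext-fluent is not decided here.

```python
import re

# Expression quoted earlier in the thread from ckanext/fluent/validators.py.
BCP_47_LANGUAGE = r'^[a-z]{2,8}(-[0-9a-zA-Z]{1,8})*$'


def to_bcp47_style(locale_code):
    """Hypothetical helper: turn a locale code like "zh_CN" into "zh-CN"."""
    return locale_code.replace('_', '-')


def is_accepted(locale_code):
    """True if the converted code matches the existing expression."""
    return re.match(BCP_47_LANGUAGE, to_bcp47_style(locale_code)) is not None


for code in ["zh_CN", "zh_Hans_CN", "en-t-fr", "zh_TW.Big5", "sr@latin"]:
    print(code, "->", to_bcp47_style(code), is_accepted(code))
```

Codes such as `zh_CN` and `zh_Hans_CN` pass after conversion and existing BCP-47 keys like `en-t-fr` are unchanged, but codes with a `.` or `@` suffix still fail, which is why this covers "most" rather than all locale codes.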
@wardi @cicerobcastro thanks for replying. I already fixed this issue by using just `zh`.
CKAN Version: 2.10.1
CKAN Extensions Installed: ckanext-fluent, ckanext-scheming
Description:
When testing ckanext-fluent with the `zh_CN` language, I encountered a `TypeError: list indices must be integers or slices, not str` error. This error does not occur with other languages.
The error occurs in the unflatten function in `ckan/lib/navl/dictization_functions.py` on the line `current_pos = current_pos[key]`. Here, `current_pos` is a list and `key` is likely a string, which leads to the TypeError.
Here is the traceback for the error:
pdb data :
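The kind of failure described in the report can be reproduced in isolation with the sketch below. It is not the original traceback and not CKAN's `unflatten` code: `walk`, `try_walk`, and the sample data are made up, and how a `zh_CN` field ends up producing a string key at a list position is not shown in the report.

```python
def walk(nested, path):
    """Follow each key in `path`, one level at a time, via current_pos[key]."""
    current_pos = nested
    for key in path:
        current_pos = current_pos[key]
    return current_pos


# Made-up nested data: "extras" holds a list, so it must be indexed by integers.
nested = {"extras": [{"key": "title_translated"}]}


def try_walk(path):
    """Return the value at `path`, or the TypeError message if indexing fails."""
    try:
        return walk(nested, path)
    except TypeError as exc:
        return f"TypeError: {exc}"


print(try_walk(("extras", 0, "key")))   # integer index into the list works
print(try_walk(("extras", "CN")))       # string index into the list fails
```

The second call fails because a string is used to index a list, which is the same error message as in the report.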