Closed martino-vic closed 2 years ago
`Language_ID=` is definitely wrong: the table in question is the LanguageTable, so the keyword must be `ID` here. But the TypeError shows that `virtual` cannot be passed as a keyword, so there must be another way to manipulate the language table or to add the language.
If you check this example here:
You can also define your language object as a Python dictionary (for one language only) and add the "virtual" keyword there (of course, you'd only write "LanguageTable").
Sorry, this is wrong. You need to change the column.
Sorry again: this is not even about the language table, but only about the FormTable, where you'd have to supply your Language_ID. So in this case, the form table has all kinds of columns, and the metadata will specify that the language ID is virtual and the same for all of your data.
But there is a problem with the `add_form` and `add_form_with_segments` functions: they check whether a `Language_ID` has been passed and raise an error if it is missing.
Since we are only talking about one column here, and since it would require substantial workarounds (as far as I can see), I'd suggest just adding the Language_ID and not using a virtual column for the FormTable.
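A minimal sketch of that suggestion, using only the standard library rather than pylexibank: every form row carries the same Language_ID explicitly. The ID "blob" is taken from the question; the helper `add_language_id`, the sample rows, and the column names other than Language_ID are made up for illustration.

```python
import csv
import io

LANGUAGE_ID = "blob"  # the single language of the dataset (ID from the question)

# Hypothetical raw data: (concept, form) pairs without language information.
raw_forms = [("hand", "form1"), ("water", "form2"), ("fire", "form3")]


def add_language_id(rows, language_id):
    """Return form rows as dicts, each with an explicit, non-virtual Language_ID."""
    return [
        {
            "ID": f"{language_id}-{i}",
            "Language_ID": language_id,
            "Parameter_ID": concept,
            "Form": form,
        }
        for i, (concept, form) in enumerate(rows, start=1)
    ]


forms = add_language_id(raw_forms, LANGUAGE_ID)

# Serialize the rows as CSV, roughly as they would appear in a forms table.
buffer = io.StringIO()
writer = csv.DictWriter(
    buffer, fieldnames=["ID", "Language_ID", "Parameter_ID", "Form"]
)
writer.writeheader()
writer.writerows(forms)
print(buffer.getvalue())
```

In a real conversion script the same constant would simply be passed as `Language_ID` with every form, so nothing downstream has to know about virtual columns.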
While the "virtual column" mechanism is kind of cool, it isn't too well supported by the CLDF toolset (e.g. not in `pylexibank`, hence this issue). In particular, I wouldn't rely on many tools being able to infer essential things like reference values (aka foreign keys) from virtual columns, which always requires an additional "resolve-column-value" step.
So my recommendation would be to actually add a non-virtual language ID column.
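To illustrate what such a "resolve-column-value" step involves, here is a rough sketch in plain Python. The column descriptions imitate CSVW metadata (`virtual` and `valueUrl` are CSVW properties), but the function `resolve_virtual_columns` is hypothetical and not part of any CLDF tool, and it treats `valueUrl` as a plain constant instead of expanding it as a URI template.

```python
# Simplified column descriptions in the style of CSVW table metadata.
# The values are illustrative; "blob" is the language ID from the question.
columns = [
    {"name": "ID"},
    {"name": "Form"},
    {"name": "Language_ID", "virtual": True, "valueUrl": "blob"},
]


def resolve_virtual_columns(row, columns):
    """Return a copy of `row` with the values of virtual columns filled in.

    Simplification: `valueUrl` is used as a constant; URI templates,
    which CSVW allows here, are not expanded.
    """
    resolved = dict(row)
    for col in columns:
        if col.get("virtual") and col["name"] not in resolved:
            resolved[col["name"]] = col["valueUrl"]
    return resolved


row = {"ID": "1", "Form": "abc"}  # as read from the CSV file: no Language_ID
print("Language_ID" in row)  # False
print(resolve_virtual_columns(row, columns)["Language_ID"])  # blob
```

Any tool that reads only the CSV file and skips this step never sees a Language_ID, which is why foreign keys built on virtual columns are fragile.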
Okay, I see, thanks for the help! It isn't a big deal to have a non-virtual column after all: I just checked, and it's actually only 28KB to spell out the language ID 3500 times in my data frame. I just thought that the virtual column was the preferred way to handle this type of data, but if it's not, that's no problem. Thanks once more for the help, I appreciate it!
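As a rough cross-check of that figure: 28KB over 3500 rows is about 8 bytes per row, which fits a short ID plus one delimiter. The sketch below assumes a hypothetical 7-character ID, since the actual ID length is not stated in the thread.

```python
def explicit_column_overhead(language_id: str, n_rows: int) -> int:
    """Bytes added to a CSV file by repeating `language_id` in every row.

    Counts the UTF-8 bytes of the ID plus one delimiter per row.
    """
    return n_rows * (len(language_id.encode("utf-8")) + 1)


# "abcdefg" is a placeholder ID of 7 characters, not the real one.
print(explicit_column_overhead("abcdefg", 3500))  # 28000
```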
I think virtual columns are in the 20% of the CSVW spec that you shouldn't use, because they're not in the 80% of the spec that tools implement and then claim CSVW compliance :)
Alright, I wasn't aware of that, it's great to know this actually for future work as well 👍
Yeah, I think we should spell this out more explicitly somewhere. The way it is now, it's nothing you could "know" - but rather something I always do when using software/standards: Stick with the 80% functionality implementing the core use cases.
Hi,
I'm on Windows 10, using Python 3.10, and I'm trying to add a virtual column through my conversion script, since there is only one language in my data. The FAQ mentions that virtual columns can be added manually, but I'm wondering if it's possible to do so through the conversion script as well. I tried changing line 52 in different ways, which resulted in the following errors:

`args.writer.add_language(Language_ID="blob", virtual=True)` -> `TypeError: Language.__init__() got an unexpected keyword argument 'Language_ID'`

`args.writer.add_language(ID="blob", virtual=True)` -> `TypeError: Language.__init__() got an unexpected keyword argument 'virtual'`