chrzyki closed this issue 4 years ago
This should already be handled at the level of lexibank.makecldf, since there are quite a few datasets where we have a Source for a language and then simply reuse it as the source for each form. In fact, I reckon these make up 50% of all datasets with sources from multiple references.
I think this requires a bit more debugging. Which table is causing the problems? And which column exactly is about to be added? And does the db already hold other datasets? Supposedly, this line https://github.com/lexibank/pylexibank/blob/cd08b999271a58abc89d88e091b320418dffc9f4/src/pylexibank/db.py#L365 should guard against the issue. Why doesn't it?
Can be reproduced with:
pylexibank==2.7.1
pip install -e git+https://github.com/lexibank/birchallchapacuran.git#egg=lexibank_birchallchapacuran
pip install -e git+https://github.com/lexibank/abvd.git#egg=lexibank_abvd
Then:
(lexibanksqlite) ~$ cldfbench lexibank.load _ --glottolog ~/Repositories/glottolog/glottolog --concepticon ~/Repositories/concepticon/concepticon-data/
abvd loads successfully, then:
Dataset "birchallchapacuran" at .virtualenvs/lexibanksqlite/src/lexibank-birchallchapacuran
Traceback (most recent call last):
  File ".virtualenvs/lexibanksqlite/bin/cldfbench", line 8, in <module>
    sys.exit(main())
  File ".virtualenvs/lexibanksqlite/lib/python3.8/site-packages/cldfbench/__main__.py", line 78, in main
    return args.main(args) or 0
  File ".virtualenvs/lexibanksqlite/lib/python3.8/site-packages/pylexibank/commands/load.py", line 17, in run
    with_datasets(args, db.load)
  File ".virtualenvs/lexibanksqlite/lib/python3.8/site-packages/cldfbench/cli_util.py", line 90, in with_datasets
    res.append(with_dataset(args, func, dataset=ds))
  File ".virtualenvs/lexibanksqlite/lib/python3.8/site-packages/cldfbench/cli_util.py", line 82, in with_dataset
    res = func(*arg, args)
  File ".virtualenvs/lexibanksqlite/lib/python3.8/site-packages/pylexibank/db.py", line 367, in load
    conn.execute(
sqlite3.OperationalError: duplicate column name: Source
Could the non-capitalized source in abvd's LanguageTable be problematic?
Changing from source to Source in abvd's LanguageTable fixes this for me.
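The clash can be reproduced in isolation with the standard library alone. This is only a minimal sketch: the table layout is a hypothetical stand-in for the shared LanguageTable, and it assumes (based on the traceback, not on reading db.py) that the loader compares column names case-sensitively before issuing an ALTER TABLE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical stand-in for the shared table: the first dataset
# contributes a lowercase "source" column.
conn.execute("CREATE TABLE LanguageTable (ID TEXT, source TEXT)")

# Column names currently in the table (field 1 of each table_info row).
existing = {row[1] for row in conn.execute("PRAGMA table_info(LanguageTable)")}

# A case-sensitive membership test does not see "Source" as present ...
needs_column = "Source" not in existing

error = None
if needs_column:
    try:
        # ... but SQLite treats "Source" and "source" as the same column.
        conn.execute('ALTER TABLE LanguageTable ADD COLUMN "Source" TEXT')
    except sqlite3.OperationalError as exc:
        error = str(exc)

print(needs_column, error)
```

If the assumption holds, this explains why loading abvd first and birchallchapacuran second fails, while renaming the column in abvd avoids the problem.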
Hmm, the abvd provider has lots of lowercase fields, so I suspect there might be more clashes.
So, considering that SQL column names are case insensitive, the check in https://github.com/lexibank/pylexibank/blob/cd08b999271a58abc89d88e091b320418dffc9f4/src/pylexibank/db.py#L365 should be case insensitive, too.
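A case-insensitive guard along these lines might look as follows. This is a sketch under assumptions, not the actual pylexibank code: add_column_if_missing is a hypothetical helper, and the table and column names are only examples.

```python
import sqlite3


def add_column_if_missing(conn, table, column, sql_type="TEXT"):
    """Add `column` to `table` unless a column with the same name,
    ignoring case, already exists. Returns True if a column was added.

    Hypothetical helper for illustration; not part of pylexibank.
    """
    rows = conn.execute('PRAGMA table_info("{}")'.format(table))
    # Normalise existing names so "source" and "Source" compare equal.
    existing = {row[1].lower() for row in rows}
    if column.lower() in existing:
        return False
    conn.execute(
        'ALTER TABLE "{}" ADD COLUMN "{}" {}'.format(table, column, sql_type))
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE LanguageTable (ID TEXT, source TEXT)")

# "Source" clashes with the existing lowercase column and is skipped.
added_source = add_column_if_missing(conn, "LanguageTable", "Source")
# "Family" is genuinely new and is added.
added_family = add_column_if_missing(conn, "LanguageTable", "Family")
print(added_source, added_family)
```

Note that SQLite only folds case for ASCII characters, so str.lower() is an approximation for column names containing non-ASCII letters.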
E.g. https://github.com/lexibank/birchallchapacuran defines sources for both languages and forms. Calling cldfbench lexibank.load for a data set like this results in an error (the duplicate column name: Source failure in the traceback above). Is this because sources are supposed to be written only 'once' per data set for the Lexibank sqlite.db?