lexibank / lsi

CLDF dataset derived from Grierson's "Linguistic Survey of India" from 1928
https://lsi.clld.org
Creative Commons Attribution 4.0 International

Unresolved conflict in cldf metadata #20

Closed xrotwang closed 4 years ago

xrotwang commented 4 years ago

This won't fly: https://github.com/lexibank/lsi/blob/a14200cdaffdf5159cbb5c2c9e2b7b651ac97fae/cldf/cldf-metadata.json#L17-L37

xrotwang commented 4 years ago

cldf/languages.csv has a conflict marker as well.

xrotwang commented 4 years ago

Trying to run cldfbench lexibank.makecldf on my end, I get:

Traceback (most recent call last):
  File "/home/forkel/venvs/lsi/bin/cldfbench", line 8, in <module>
    sys.exit(main())
  File "/home/forkel/venvs/lsi/lib/python3.5/site-packages/cldfbench/__main__.py", line 78, in main
    return args.main(args) or 0
  File "/home/forkel/venvs/lsi/lib/python3.5/site-packages/pylexibank/commands/makecldf.py", line 23, in run
    with_dataset(args, 'makecldf', dataset=dataset)
  File "/home/forkel/venvs/lsi/lib/python3.5/site-packages/cldfbench/cli_util.py", line 82, in with_dataset
    res = func(*arg, args)
  File "/home/forkel/venvs/lsi/lib/python3.5/site-packages/pylexibank/dataset.py", line 217, in _cmd_makecldf
    super()._cmd_makecldf(args)
  File "/home/forkel/venvs/lsi/lib/python3.5/site-packages/cldfbench/dataset.py", line 211, in _cmd_makecldf
    self.cmd_makecldf(args)
  File "./lexibank_lsi.py", line 80, in cmd_makecldf
    with open(f) as this_file:
TypeError: invalid file: DataDir('raw/LSI_txt/1-21/10-11 Five.txt')

LinguList commented 4 years ago

Okay, @phylostar, it seems you need to check for unresolved conflicts. I think they were introduced when you updated parts of the digitization. You should just grep for >>> and make sure we have the correct version.
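
A bare `>>>` can also match unrelated lines, so it may help to look for the full Git markers instead. The following is only a minimal sketch, assuming the standard markers (seven `<`, `=`, `>` or `|` characters at the start of a line); the function name `find_conflict_markers` and the default suffix list are illustrative and not part of the repository.

```python
import re
from pathlib import Path

# Git writes "<<<<<<<", "=======" and ">>>>>>>" (seven characters each)
# at the start of a line; "|||||||" appears with the diff3 conflict style.
MARKER = re.compile(r"^(<{7}|={7}|>{7}|\|{7})(\s|$)")


def find_conflict_markers(root, suffixes=(".csv", ".json", ".txt", ".py")):
    """Yield (path, line_number, line) for lines that look like conflict markers."""
    for path in sorted(Path(root).rglob("*")):
        # Only inspect regular files with one of the given suffixes.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        with path.open(encoding="utf-8", errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if MARKER.match(line):
                    yield path, number, line.rstrip("\n")
```

Calling `list(find_conflict_markers("cldf"))` from the repository root would list candidate lines in the generated CLDF files. A line consisting of exactly seven `=` characters can also occur in ordinary text, so hits still need a manual look.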

LinguList commented 4 years ago

But it runs here, and I have Python 3.8. Is it possible that 3.5 is stricter?

LinguList commented 4 years ago

@xrotwang, should I just update the cldf?

PhyloStar commented 4 years ago

I am using Python 3.6.9. Didn't have any issues.

LinguList commented 4 years ago

I just pushed the new cldf code without conflicts and the like (I hope). If the file is corrupted, I think it is the encoding (?).

xrotwang commented 4 years ago

There was a file missing in raw. Probably not added and pushed?

Johann-Mattis List notifications@github.com wrote on Thu, 21 May 2020, 16:25:

I just pushed the new cldf code without conflicts and the like (I hope). If the file is corrupted, I think it is the encoding (?).

LinguList commented 4 years ago

Strange. Is it there now?

xrotwang commented 4 years ago

It seems to have been a Python 3.5 problem. Using a path with a space in it doesn't work with open(p); it does work like this, though:

$ git diff lexibank_lsi.py
diff --git a/lexibank_lsi.py b/lexibank_lsi.py
index 0d15662..3e0f423 100644
--- a/lexibank_lsi.py
+++ b/lexibank_lsi.py
@@ -77,7 +77,7 @@ class Dataset(BaseDataset):
             current_language = ''
             concept = f.name[:-4]
             args.log.info('Parsing {0}'.format(concept))
-            with open(f) as this_file:
+            with f.open() as this_file:
                 data = this_file.readlines()
                 for line in data:
                     line = unicodedata.normalize('NFD', line)
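
The traceback above is consistent with a known difference between Python versions: the built-in `open()` accepts `pathlib` path objects only from Python 3.6 on, which would also explain why 3.6.9 and 3.8 showed no problem, so the space in the file name may be incidental. The following is a minimal sketch of the two variants that also work on 3.5. It uses a plain `pathlib.Path` as a stand-in for the `DataDir` object (assuming `DataDir` behaves like a `pathlib.Path`, as the `f.open()` fix suggests); `read_lines` and the temporary file are hypothetical.

```python
import tempfile
from pathlib import Path


def read_lines(path):
    """Read all lines via the path object's own open() method.

    Path.open() works on every Python 3 version that ships pathlib, whereas
    the built-in open(path) accepts path objects only from Python 3.6 on.
    """
    with path.open(encoding="utf-8") as handle:
        return handle.readlines()


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for a raw data file; the space is deliberate.
    sample = Path(tmp) / "10-11 Five.txt"
    sample.write_text("first\nsecond\n", encoding="utf-8")
    lines = read_lines(sample)
    # open(str(sample)) is the other variant that also works on Python 3.5.
    with open(str(sample), encoding="utf-8") as handle:
        assert handle.readlines() == lines
```

Either `path.open()` or `open(str(path))` avoids depending on the interpreter version, which matters while Python 3.5 environments are still in use.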

PhyloStar commented 4 years ago

@xrotwang Is the issue resolved? I pulled the latest version and could rerun the makecldf command without any issue.