ohare93 / brain-brew

Automated Anki flashcard creation and extraction to/from Csv
The Unlicense

issues importing data from csv #31

Closed boydkelly closed 3 years ago

boydkelly commented 3 years ago

Hi, I am getting some errors running the source-to-Anki recipe on new data. I have tried adding the data directly to the CSV initialized under src/data, and have also tried setting up a new recipe. My CSV data is converted from TSV using a Python script. The data looks OK in vim with the csv plugin. But the original TSV file has embedded commas, extended Latin characters and embedded double quotes. And since this is language/prose, there is no guarantee that there will not be a stray quote. So I suspect that it may be a CSV issue. It would be on my wishlist to have a TSV data source, which would at least eliminate these comma and quote problems. Otherwise, here is a sample error. I do note the message referring to tags, so I tried a file with and without any tags, with the same results. Any ideas on a solution appreciated.

[bkelly@toolbox Jula]$ brainbrew run recipes/headwords-5.yaml 
INFO:root:Builder file recipes/headwords-5.yaml is ✔ good
Traceback (most recent call last):
  File "/var/home/bkelly/.local/bin/brainbrew", line 33, in <module>
    sys.exit(load_entry_point('Brain-Brew==0.3.5', 'console_scripts', 'brainbrew')())
  File "/var/home/bkelly/.local/lib/python3.9/site-packages/brain_brew/main.py", line 19, in main
    command.execute()
  File "/var/home/bkelly/.local/lib/python3.9/site-packages/brain_brew/commands/run_recipe/run_recipe.py", line 15, in execute
    recipe = TopLevelBuilder.parse_and_read(self.recipe_file_name, self.verify_only)
  File "/var/home/bkelly/.local/lib/python3.9/site-packages/brain_brew/commands/run_recipe/top_level_builder.py", line 60, in parse_and_read
    return cls.from_list(recipe_data)
  File "/var/home/bkelly/.local/lib/python3.9/site-packages/brain_brew/commands/run_recipe/recipe_builder.py", line 20, in from_list
    tasks = cls.read_tasks(data)
  File "/var/home/bkelly/.local/lib/python3.9/site-packages/brain_brew/commands/run_recipe/recipe_builder.py", line 68, in read_tasks
    task_or_tasks = [matching_task.from_repr(task_arguments)]
  File "/var/home/bkelly/.local/lib/python3.9/site-packages/brain_brew/commands/run_recipe/parts_builder.py", line 40, in from_repr
    return cls.from_list(data)
  File "/var/home/bkelly/.local/lib/python3.9/site-packages/brain_brew/commands/run_recipe/recipe_builder.py", line 20, in from_list
    tasks = cls.read_tasks(data)
  File "/var/home/bkelly/.local/lib/python3.9/site-packages/brain_brew/commands/run_recipe/recipe_builder.py", line 73, in read_tasks
    inner_task.execute()
  File "/var/home/bkelly/.local/lib/python3.9/site-packages/brain_brew/build_tasks/csvs/notes_from_csvs.py", line 74, in execute
    notes_part: List[Note] = [self.csv_row_to_note(row, self.note_model_mappings) for row in csv_rows]
  File "/var/home/bkelly/.local/lib/python3.9/site-packages/brain_brew/build_tasks/csvs/notes_from_csvs.py", line 74, in <listcomp>
    notes_part: List[Note] = [self.csv_row_to_note(row, self.note_model_mappings) for row in csv_rows]
  File "/var/home/bkelly/.local/lib/python3.9/site-packages/brain_brew/build_tasks/csvs/notes_from_csvs.py", line 87, in csv_row_to_note
    tags = split_tags(filtered_fields.pop("tags"))
  File "/var/home/bkelly/.local/lib/python3.9/site-packages/brain_brew/utils.py", line 83, in split_tags
    split = [entry.strip() for entry in re.split(r';\s*|,\s*|\s+', tags_value)]
  File "/usr/lib64/python3.9/re.py", line 231, in split
    return _compile(pattern, flags).split(string, maxsplit)
TypeError: expected string or bytes-like object
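
The last frames of the traceback show `re.split` receiving a value that is not a string. The following is a minimal sketch of one way that can happen, under the assumption (not confirmed anywhere in this thread) that rows are read with Python's `csv.DictReader`: a row with fewer fields than the header gets `None` for the missing columns, and passing `None` to `re.split` raises this kind of `TypeError`. The column names and data are hypothetical.

```python
import csv
import io
import re

# Hypothetical data: the header has three columns, the second data row only two.
raw = "guid,word,tags\n1,fɔ,verb\n2,dire\n"

rows = list(csv.DictReader(io.StringIO(raw)))

# DictReader fills missing trailing fields with None (its default restval).
short_row_tags = rows[1]["tags"]

# Same pattern as the one shown in the traceback.
pattern = r';\s*|,\s*|\s+'

try:
    re.split(pattern, short_row_tags)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(repr(short_row_tags))
print(error_message)
```

If this is what happened, the error points at a malformed row rather than at the tag values themselves, which would fit the observation that adding or removing tags made no difference.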

Single sample row being imported:

6084b28d-1134-4ecf-bc77-ec8ff4e5ba36,fɔ, ,,dire; jouer (d'un instrument),,,,,,,,,,,, ,,,,,,,,,Musa k'a fɔ n ye.,Moussa me l'a dit.,Musa k'a fɔ n ye ko Fanta be na.,Musa m'a dit que Fanta viendra.,mɔgɔw tun be dundun fɔla.,les gens jouaient du tam-tam.,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,, ,
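
For the TSV-to-CSV conversion step described in the report, here is a hedged sketch that uses only Python's standard `csv` module, so that embedded commas and double quotes are escaped on output. It assumes that double quotes in the TSV are literal characters rather than field delimiters; the function name `tsv_to_csv` and the sample text are hypothetical, not taken from the reporter's script.

```python
import csv
import io


def tsv_to_csv(tsv_text: str) -> str:
    """Convert tab-separated text to CSV, quoting fields only where needed."""
    # QUOTE_NONE: treat any double quote in the TSV as an ordinary character.
    reader = csv.reader(
        io.StringIO(tsv_text, newline=""),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,
    )
    out = io.StringIO(newline="")
    # QUOTE_MINIMAL quotes fields containing commas, quotes or newlines,
    # and doubles any embedded double quote.
    writer = csv.writer(out, quoting=csv.QUOTE_MINIMAL, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


sample = 'fɔ\tdire; jouer (d\'un instrument)\tHe said "hello", then left\n'
converted = tsv_to_csv(sample)

# Reading the CSV back should give the original three fields unchanged.
round_trip = next(csv.reader(io.StringIO(converted, newline="")))
print(converted)
print(round_trip)
```

Reading the converted text back with `csv.reader`, as the last lines do, is a cheap way to check that a stray quote in the prose has not shifted any columns.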

ohare93 commented 3 years ago

Sorry this is happening. I can't see the issue at a glance from your example row here :thinking: Do you have tags in the headers of the CSV? In fact, may I have an example file, with the headers and just a few lines? Thanks :handshake:

boydkelly commented 3 years ago

My bad. Closing this. At least the benefit for me is that brainbrew revealed a bug in my xsl transform. Just a couple of records out of 600 had an issue that was causing this. Since I am working on another 2000 or so, glad to have identified and fixed the issue. Thanks!!
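
Since the cause turned out to be a handful of bad records among several hundred, a quick pre-check of the CSV can locate such rows before running a recipe. The sketch below is only an illustration: the helper name `find_malformed_rows` and the sample data are hypothetical, and it checks nothing more than whether each row has the same number of fields as the header.

```python
import csv
import io
from typing import List, Tuple


def find_malformed_rows(csv_text: str) -> List[Tuple[int, int]]:
    """Return (line_number, field_count) for rows whose width differs from the header."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    header = next(reader)
    problems = []
    for row in reader:
        if len(row) != len(header):
            # reader.line_num is the source line on which this record ended.
            problems.append((reader.line_num, len(row)))
    return problems


# Hypothetical sample: four header columns, two deliberately broken rows.
sample = (
    "guid,word,translation,tags\n"
    "1,fɔ,\"dire; jouer (d'un instrument)\",verb\n"
    "2,na,venir\n"             # one field short
    "3,ye,voir,verb,extra\n"   # one field too many
)

problems = find_malformed_rows(sample)
print(problems)
```

For a file on disk, the same function can be given the result of reading the file as UTF-8 text; the reported line numbers then point straight at the records to inspect in the upstream transform.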