uvacw / inca


importer fix #392

Closed FeLoe closed 6 years ago

FeLoe commented 6 years ago

The importer function should now work without supplying a doctype.

FeLoe commented 6 years ago

Now all the importers should work for files as well as folders.
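For illustration only, accepting either a file or a folder could look roughly like the sketch below. `resolve_input_files` and its `suffix` parameter are hypothetical names made up for this example; this is not the code in the PR.

```python
from pathlib import Path


def resolve_input_files(path, suffix=".csv"):
    """Return a list of files to import (hypothetical helper, not inca's API).

    A single file is returned as a one-element list; a folder is expanded
    to the files directly inside it that end in `suffix`.
    """
    p = Path(path)
    if p.is_file():
        return [p]
    if p.is_dir():
        # Sorted so that the import order is reproducible across runs.
        return sorted(f for f in p.iterdir() if f.is_file() and f.suffix == suffix)
    raise FileNotFoundError(f"No such file or folder: {path}")
```

An importer could then loop over the returned list in the same way for both kinds of input.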

damian0604 commented 6 years ago

Perfect, thanks! Maybe @mariekevh can test it on Wednesday?

mariekevh commented 6 years ago

Yes, I will look at it this Wednesday :)

FeLoe commented 6 years ago

I checked everything again and found a couple of things that caused problems:

If this is desired behaviour, it should be noted somewhere, but I think it is a little strange that only one file can be exported with headers.

But if I select only the non-HTML fields and include headers, the import function works just fine for me 😉

damian0604 commented 6 years ago

Thanks, @FeLoe. I think HTML export should be optional; the vast majority of users won't need or want it (those who do probably export to JSON anyway).

Regarding the headers: I was not aware of this, but I actually find the current behaviour desirable, because it allows concatenating files without having headers in between:

cat output1.csv output2.csv > everythinginonelargefile.csv

or

cat output*.csv > everythinginonelargefile.csv

But you are right that this should be noted somewhere and/or be optional.
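If headers did become optional, files that each carry their own header row could no longer be joined with a plain `cat`. A minimal sketch of how that case might be handled, assuming all files share the same columns; `concatenate_csv` and `every_file_has_header` are hypothetical names for this example and are not part of inca:

```python
import csv


def concatenate_csv(paths, destination, every_file_has_header=False):
    """Join several CSV files into one (hypothetical helper, not inca's API).

    With every_file_has_header=False, rows are copied through unchanged,
    like `cat`. With True, the first row of every file after the first is
    skipped, so only one header ends up in the output.
    """
    with open(destination, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for file_index, path in enumerate(paths):
            with open(path, newline="", encoding="utf-8") as src:
                for row_index, row in enumerate(csv.reader(src)):
                    # Drop repeated headers from the second file onwards.
                    if every_file_has_header and file_index > 0 and row_index == 0:
                        continue
                    writer.writerow(row)
```

Called as `concatenate_csv(["output1.csv", "output2.csv"], "everythinginonelargefile.csv", every_file_has_header=True)`, this keeps the header of the first file only.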

FeLoe commented 6 years ago

Fixed the last issues with the exporter (it no longer exports images, and it exports HTML only if necessary). The exported documents can now be imported again. The telegraaf scraper is now also fixed (it had some issues with the titles).

damian0604 commented 6 years ago

Maybe @mariekevh can have a final look and check? Then I'll merge (and resolve the conflicts that seem to be there)

mariekevh commented 6 years ago

@damian0604 Actually sitting next to Felicia :)

mariekevh commented 6 years ago

Works fine! @damian0604 The conflict arises because, while Felicia was working on this, I solved the no-headers issue in export_csv in PR #391.