Closed by stephaniereinders 1 day ago
It turns out that the error resulted from several blank documents in the data set. A blank document has no graphs once it is processed, so get_clusters_batch throws an error when it tries to access the non-existent graphs. I fixed processDocument so that it throws an error if a document doesn't contain any graphs, and get_clusters_batch now returns a warning message. These changes are in #190.
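For anyone hitting the same message, here is a minimal base-R sketch of the failure mode and of the guard-then-warn pattern described above. It is not the code from #190: `make_doc`, `check_has_graphs`, `first_graph_batch`, the `graphs` list element, and the file names are all hypothetical stand-ins, and the real handwriter data structures are assumed to differ.

```r
# Hypothetical stand-in for a processed document: a list with a `graphs`
# element. This is NOT handwriter's actual data structure.
make_doc <- function(name, graphs) list(docname = name, graphs = graphs)

# Guard: fail early with a clear message when a document has no graphs.
check_has_graphs <- function(doc) {
  if (length(doc$graphs) == 0) {
    stop("Document '", doc$docname, "' does not contain any graphs.",
         call. = FALSE)
  }
  invisible(doc)
}

# Batch step: turn the guard's error into a warning and skip the document,
# so that one blank document does not abort the whole batch.
first_graph_batch <- function(docs) {
  results <- lapply(docs, function(doc) {
    tryCatch({
      check_has_graphs(doc)
      doc$graphs[[1]]
    }, error = function(e) {
      warning(conditionMessage(e), call. = FALSE)
      NULL
    })
  })
  names(results) <- vapply(docs, function(d) d$docname, character(1))
  results
}

# Two made-up documents: one with graphs, one blank.
docs <- list(
  make_doc("w0001.png", list("g1", "g2")),
  make_doc("blank.png", list())
)

# Without the guard, indexing into the empty list raises the error
# reported in this issue.
unguarded <- tryCatch(docs[[2]]$graphs[[1]],
                      error = function(e) conditionMessage(e))

# With the guard, the blank document is skipped (warnings are suppressed
# here only to keep the example quiet).
res <- suppressWarnings(first_graph_batch(docs))
```

In this sketch `unguarded` holds the raw error message, while `res` has an entry for every document and `NULL` for the blank one. Dropping the `suppressWarnings()` wrapper surfaces one warning per blank document.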
I processed all 1,604 CLV documents with process_batch_dir() and then ran get_clusters_batch() on the processed documents. get_clusters_batch() resulted in an error:
Error in { : task 1022 failed - "subscript out of bounds"