I think this is something we fixed before in the softcosine analysis, but it's not fixed in the standard cosine analysis: fit fails with an IndexError during the sliding window comparisons.
INFO:INCA:Removing all tokens that occur in less than 2 documents or in more than 50.0% or all documents from dictionary
INFO:INCA:Preparing tfidf model
INFO:INCA:Performing sliding window comparisons...
Traceback (most recent call last):
  File "./025-cosine_newsevents.py", line 11, in <module>
    myinca.analysis.cosine_similarity.fit(source=['nu', 'ad (www)', 'volkskrant (www)'], target=['nu', 'ad (www)', 'volkskrant (www)'], sourcetext='softcosine_processed', targettext='softcosine_processed', from_time='2018-11-26', to_time='2019-05-26', days_before=0, days_after=2, merge_weekend=True, filter_below=2, destination='/mnt/elastic/cosine_output/')
  File "/usr/local/lib/python3.6/dist-packages/inca/analysis/cosine_analysis.py", line 197, in fit
    if group[0]['_source'][sourcedate].weekday()==6:
IndexError: list index out of range
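The traceback suggests that `group` is an empty list for at least one window, so `group[0]` has nothing to index. A minimal sketch of a possible guard follows. It assumes that `group` is a list of Elasticsearch-style hits and that the date field holds a `datetime`; the helper name `first_doc_is_sunday` and the field name `publication_date` are hypothetical and not taken from INCA, and this is not necessarily how the softcosine analysis handles it.

```python
from datetime import datetime


def first_doc_is_sunday(group, sourcedate):
    """Return True only if the group has a first document dated on a Sunday.

    Hypothetical helper: an empty group returns False instead of raising
    IndexError, which is the failure shown in the traceback.
    """
    if not group:  # empty window, so group[0] would raise IndexError
        return False
    # datetime.weekday() counts Monday as 0 and Sunday as 6
    return group[0]['_source'][sourcedate].weekday() == 6


# 'publication_date' is an assumed field name, used only for illustration
empty_group = []
sunday_group = [{'_source': {'publication_date': datetime(2018, 12, 2)}}]
monday_group = [{'_source': {'publication_date': datetime(2018, 12, 3)}}]

print(first_doc_is_sunday(empty_group, 'publication_date'))
print(first_doc_is_sunday(sunday_group, 'publication_date'))
print(first_doc_is_sunday(monday_group, 'publication_date'))
```

Whether an empty group should be skipped entirely or handled some other way when merge_weekend=True is a design choice for the maintainers; the sketch only shows how to avoid the crash.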