uvacw / inca


index error in cosine analysis #507

Closed damian0604 closed 4 years ago

damian0604 commented 4 years ago

I think this is something we fixed before in the softcosine analysis, but it's not fixed in the standard cosine analysis:

INFO:INCA:Removing all tokens that occur in less than 2 documents or in more than 50.0% or all documents from dictionary
INFO:INCA:Preparing tfidf model
INFO:INCA:Performing sliding window comparisons...
Traceback (most recent call last):
  File "./025-cosine_newsevents.py", line 11, in <module>
    myinca.analysis.cosine_similarity.fit(source=['nu', 'ad (www)', 'volkskrant (www)'], target=['nu', 'ad (www)', 'volkskrant (www)'], sourcetext='softcosine_processed', targettext='softcosine_processed', from_time='2018-11-26', to_time='2019-05-26', days_before=0, days_after=2, merge_weekend=True, filter_below=2, destination='/mnt/elastic/cosine_output/')
  File "/usr/local/lib/python3.6/dist-packages/inca/analysis/cosine_analysis.py", line 197, in fit
    if group[0]['_source'][sourcedate].weekday()==6:
IndexError: list index out of range
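
The traceback suggests that `group` is an empty list for at least one sliding window (for example, a window in which no documents were found), so `group[0]` fails. Below is a minimal sketch of a guard, not the actual patch: the helper name `first_doc_is_sunday` and the date field `publication_date` are hypothetical, and the document layout is only inferred from the failing line.

```python
from datetime import datetime


def first_doc_is_sunday(group, sourcedate="publication_date"):
    """Return True if the first document in `group` is dated on a Sunday.

    An empty group yields False instead of raising IndexError.
    `sourcedate` is a hypothetical field name used for illustration.
    """
    if not group:  # empty window: nothing to inspect
        return False
    # datetime.weekday() returns 6 for Sunday
    return group[0]["_source"][sourcedate].weekday() == 6


# Illustrative windows (made-up data, shaped like the failing expression expects)
empty_window = []
sunday_window = [{"_source": {"publication_date": datetime(2019, 5, 26)}}]
monday_window = [{"_source": {"publication_date": datetime(2019, 5, 27)}}]

print(first_doc_is_sunday(empty_window))
print(first_doc_is_sunday(sunday_window))
print(first_doc_is_sunday(monday_window))
```

Whether an empty window should be skipped entirely or treated as "not a Sunday" depends on how `merge_weekend` is meant to behave, so the return value for the empty case is an assumption.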