Bookworm-project / BookwormDB

Tools for text tokenization and encoding
MIT License

options.json file returns incomplete year range #56

Open tpmccallum opened 9 years ago

tpmccallum commented 9 years ago

Hi, I have a weird issue: the year 2011 is not showing in my graph <http://54.153.174.142/mccallum/>. The json.catalog.txt file has plenty of entries for 2011, e.g.

{"date": 2011, "uni": "uow", "searchstring": "<a href=\"http://www.uow.edu.au/content/groups/public/@web/@gov/documents/doc/uow125352.pdf\" target=\"_blank\">2011 document from uow</a>", "filename": "_var_www_html_mccallum_bookworm_transform_2011_uow_uow125352_2011_uow"}

The catalog.txt file also has plenty of entries:

1       _var_www_html_mccallum_bookworm_transform_2011_uow_uow125352_2011_uow   <a href="http://www.uow.edu.au/content/groups/public/@web/@gov/documents/doc/uow125352.pdf" target="_blank">2011 document from uow</a>  uow     2011

The fastcat table seems to have 2011 entries, e.g.

 26 |   4976 |      23 |      2011 |

Appreciate any suggestions on diagnosing. Kind regards Tim
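
One way to double-check the catalog side is to tally entries per year. This is only a minimal sketch, assuming the catalog holds one JSON object per line with a `date` key, as in the example entry above; the helper name `count_years` is made up for illustration and is not part of Bookworm.

```python
import json
from collections import Counter


def count_years(lines, field="date"):
    """Count catalog entries per value of `field` (assumed: one JSON object per line)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        counts[record.get(field)] += 1
    return counts


# Usage (file name taken from the report above):
# with open("json.catalog.txt", encoding="utf-8") as f:
#     print(count_years(f).most_common())
```

If 2011 shows up here with a healthy count, the catalog itself is probably fine and the gap is introduced somewhere later.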

bmschmidt commented 9 years ago

Hmm, this seems to be caused by some poor decision making on the part of the algorithm that determines what years are in the database (many bookworms have a lot of year 0 dates in them, and it tries to trim those out).

This is bad behavior and should remain filed as a bug, but you should be able to fix it manually by opening the options.json file in your web directory and replacing the occurrences of 2012 with 2011. The file looks like this:

{"ui_components": [{"type": "text", "dbfield": "word", "name": "Word(s)"}, {"categorical": {"sort_order": [], "descriptions": {}}, "type": "categorical", "dbfield": "uni__id", "name": "uni"}, {"name": "date_year", "initial": [2012, 2023], "range": [2012, 2023], "type": "time", "dbfield": "date_year", "unit": "date_year"}], "default_search": [{"counttype": "Occurrences_per_Million_Words", "smoothingSpan": 0, "words_collation": "Case_Sensitive", "time_measure": "date_year", "search_limits": [{"word": ["test"]}]}], "settings": {"sourceName": "mccallum", "sourceURL": "mccallum", "itemName": " text", "dbname": "mccallum"}}
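
If you would rather not edit the file by hand, the same change can be scripted. This is a rough sketch, not part of BookwormDB: it assumes the structure shown above (a `ui_components` list whose `time` entry carries two-element `initial` and `range` lists), and the function names are made up for illustration.

```python
import json


def set_time_range_start(options, new_start):
    """Set the lower bound of 'initial' and 'range' on every time component."""
    for component in options.get("ui_components", []):
        if component.get("type") != "time":
            continue
        for key in ("initial", "range"):
            bounds = component.get(key)
            # Only touch well-formed [start, end] pairs.
            if isinstance(bounds, list) and len(bounds) == 2:
                bounds[0] = new_start
    return options


def patch_options_file(path, new_start):
    """Load options.json, move the start year, and write the file back."""
    with open(path, encoding="utf-8") as f:
        options = json.load(f)
    set_time_range_start(options, new_start)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(options, f)


# Usage (the path is an assumption; point it at your web directory):
# patch_options_file("options.json", 2011)
```

It is worth keeping a backup copy of options.json first, since the script overwrites the file in place.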

tpmccallum commented 9 years ago

Hi Ben, thank you. I updated the options.json file and achieved the result I was after. Chat soon, Tim