Open ArtisF opened 2 years ago
Dates are reported as they appear in the original source, but odd cases like this do occur. Sometimes the webpage only says something like "published 2 hours ago", and the script has to infer the timestamp from the current time and day. In other cases, the discrepancy is explained by time-zone differences (it is currently 22:00 3/16 in Vladivostok, but 8:00 3/16 on the east coast of the US, where the data are being collected). In still others, the original source makes a mistake. Thankfully, these represent only a tiny fraction of all events, but filtering them by time/day or using a 2-day temporal filter could help.
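For anyone who wants to handle this on their side, here is a minimal sketch of the two ideas above: inferring a timestamp from a relative phrase, and flagging events dated later than the snapshot. It is not the project's actual code. The function names are hypothetical, the regex only covers English "N minutes/hours/days ago" phrases, and reading the 14 digits in the file name as a UTC snapshot time is an assumption.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical pattern: only matches English "N minute(s)/hour(s)/day(s) ago".
_RELATIVE = re.compile(r"(\d+)\s+(minute|hour|day)s?\s+ago", re.IGNORECASE)


def infer_timestamp(text, now):
    """Convert a phrase like 'published 2 hours ago' to an absolute time.

    `now` is the collection time; returns None if no relative phrase is found.
    """
    match = _RELATIVE.search(text)
    if match is None:
        return None
    amount = int(match.group(1))
    unit = match.group(2).lower() + "s"  # "hour" -> "hours" for timedelta
    return now - timedelta(**{unit: amount})


def snapshot_time(filename):
    """Parse the 14-digit stamp in a name like events_20220315144728.csv.

    Assumption: the digits are YYYYMMDDHHMMSS and are treated here as UTC.
    """
    stamp = re.search(r"(\d{14})", filename).group(1)
    parsed = datetime.strptime(stamp, "%Y%m%d%H%M%S")
    return parsed.replace(tzinfo=timezone.utc)


def is_future(event_time, snapshot, tolerance=timedelta(0)):
    """True if the event is dated later than the snapshot plus a tolerance.

    Both datetimes must be timezone-aware so that offsets are compared correctly.
    """
    return event_time > snapshot + tolerance


if __name__ == "__main__":
    snap = snapshot_time("events_20220315144728.csv")
    # A source in a UTC+10 zone reporting 22:00 local time on 3/16.
    local = datetime(2022, 3, 16, 22, 0, tzinfo=timezone(timedelta(hours=10)))
    print(is_future(local, snap))
    print(infer_timestamp("published 2 hours ago", snap))
```

Passing a nonzero `tolerance` keeps events that are only slightly ahead of the snapshot, which may be enough to absorb time-zone offsets while still dropping outright source errors.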
Are the python (scrapy?) scrapers available? The data cleanup pipeline? The rnn models and nlp processing?
Dear sbuser,
Currently, this is only a data repository, not a code repository. But I do intend to make the Python and R scripts available, along with the training data.
Best, Yuri
Yuri M. Zhukov Associate Professor of Political Science Research Associate Professor, Center for Political Studies Institute for Social Research University of Michigan Email: @.*** Website: http://sites.lsa.umich.edu/zhukov
On Sun, 24 Apr 2022 at 20:40, sbuser @.***> wrote:
Are the python (scrapy?) scrapers available? The data cleanup pipeline? The rnn models and nlp processing?
First of all, thank you very much for launching this project. I find the trend data very interesting to follow. I am not sure whether it is a bug, but I noticed that events_20220315144728.csv contains dates in the future, i.e. later than the file's publication date (see snapshot below). Is this a bug, or something that we need to adjust for?