UNCG-CSE / Library-Computer-Usage-Analysis

The University Libraries at UNCG currently track the state of a computer, determining whether or not a particular computer is in use. This data is compiled into a database, and a web app pulls from this database to show a map and number of available computers. As of Fall 2017, the data had not been used to determine which computers are used more frequently, aside from counting the number of times a computer transitions into/away from the 'in-use' state. This project attempts to correlate the usage of these computers with various factors, including: campus scheduling, equipment configuration, placement, population in the library, and area weather. Using this data, this project also uses machine learning to determine the best placement of computers for future allocation, and possible reconfiguration of equipment and space.

Add all data to repo #28

Closed by smindinvern 6 years ago

smindinvern commented 6 years ago

Specifically it looks like there is only one day's worth of machine usage data.

brownworth commented 6 years ago

That is correct. When I uploaded the file initially, it seemed to make more sense to work with 24 hours of data until we get our algorithms in place. I can choose any range of data we would like, and go from there.

smindinvern commented 6 years ago

Action item: @brownworth to add entire library dataset to the repo.

brownworth commented 6 years ago

Per our conversation in class, I will get the following:

PatriciaTanzer commented 6 years ago

Ty @brownworth

brownworth commented 6 years ago

The full logon data has been added in a separate directory called LibCSV. Because of the 25MB file size limit, the file had to be split into 29 files of about 100k lines each. The files have no header rows, as it may make sense to import from only one point.
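For anyone importing the split files, here is a minimal sketch of reading them back as one stream. It assumes the chunks sit in the LibCSV directory with a `.csv` extension and that their file names sort in the same order the original file was split; neither detail is confirmed in this thread. The function name `iter_logon_rows` is hypothetical.

```python
import csv
from pathlib import Path


def iter_logon_rows(directory, pattern="*.csv"):
    """Yield every row from the chunk files in `directory`, in file-name order.

    Assumption: sorting the file names reproduces the original row order.
    The files have no header row, so every line is treated as data.
    """
    for path in sorted(Path(directory).glob(pattern)):
        with path.open(newline="") as handle:
            yield from csv.reader(handle)
```

Calling `iter_logon_rows("LibCSV")` would then yield each row as a list of strings. Column meanings are not stated here, so the sketch leaves the rows unlabeled.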

brownworth commented 6 years ago

I just added the gate counts through 10/10/17. There are some missing data points (assumption: library is closed). There are also some places where the count rolls over from 1M. Hopefully, since the calculations are done based upon a daily delta, this will not be an issue.
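A rough sketch of how a daily delta can absorb the rollover, assuming the counter wraps at exactly 1,000,000 and wraps at most once between consecutive readings (both are assumptions; the thread only says the count "rolls over from 1M"). The name `daily_deltas` is hypothetical, and missing days are not handled here.

```python
COUNTER_MODULUS = 1_000_000  # assumed wrap point of the gate counter


def daily_deltas(readings, modulus=COUNTER_MODULUS):
    """Return the difference between consecutive cumulative readings.

    Taking the difference modulo the wrap point keeps the delta
    non-negative when the counter rolls over between two readings.
    """
    return [
        (current - previous) % modulus
        for previous, current in zip(readings, readings[1:])
    ]
```

For example, readings of 999,500, then 999,900, then 300 give deltas of 400 and 400, since the last step passes through the wrap point.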

brownworth commented 6 years ago

I have submitted a request for climatological data for 7/1/10 - 10/6/17 (the latest). I will upload it when I receive the email letting me know that the file is ready.

brownworth commented 6 years ago

The file was completed, but only had data through 12/31/16. Resubmitting the request.

brownworth commented 6 years ago

The resubmitted request provided the same data. I will update the file and see if a separate request will return a file that can be concatenated.

brownworth commented 6 years ago

Requesting 1/1/17 - 10/6/17 (latest) returns an empty set. (Screenshot not preserved.)

brownworth commented 6 years ago

The two files will need to be concatenated:

1101311.csv: 2010-07-01 00:23 - 2016-12-31 18:54
1052640.csv: 2017-01-01 00:54 - 2017-08-22 23:54
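One possible way to join the two climatological files, sketched under the assumption that each file starts with the same header row (the thread does not say whether these files have headers). The function name `concatenate_csv` and the output file name in the usage note are hypothetical.

```python
import csv


def concatenate_csv(paths, destination):
    """Write the rows of each file in `paths` to `destination`, in order.

    Assumption: every input file begins with an identical header row.
    The header is written once; a mismatched header raises ValueError.
    """
    header = None
    with open(destination, "w", newline="") as out:
        writer = csv.writer(out)
        for path in paths:
            with open(path, newline="") as src:
                reader = csv.reader(src)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip an empty file
                if header is None:
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"{path} has a different header row")
                writer.writerows(reader)
```

A call such as `concatenate_csv(["1101311.csv", "1052640.csv"], "weather_all.csv")` keeps the rows in chronological order because the first file ends on 2016-12-31 and the second begins on 2017-01-01.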

PatriciaTanzer commented 6 years ago

Ok, good to know. Thanks for getting that data, @brownworth !

smindinvern commented 6 years ago

Sweet, thanks! Is this all of the data that we need in the repo?

brownworth commented 6 years ago

It really depends on whether we're looking to do any calculations beyond 8/22/17. I would advocate not going beyond that.

smindinvern commented 6 years ago

Ok, I'll close this issue then. Thanks for getting this done.