Closed npch closed 7 years ago
Totally agree with this comment. Maybe you want an overall percentage of all jobs, but it will be very low.
Olivier and I took a look at the jobs.ac.uk website and their classifications. It looks like around 30% of jobs fall into a non-research role category (admin, finance, etc.). There's no point, as Neil says, in classifying them, so let's knock 'em out, concentrate our dataset and increase our hit rate.
@Oliph is now working on restructuring the data in the "job type" field so that we can see how many different job families exist (e.g. research, admin) and count how many jobs fall into each one. We'll then select which job families should make up the study and include only those in the classifier.
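As a rough illustration of that plan, here is a minimal Python sketch of counting jobs per family and keeping only the selected families. The `job_type` field name, the sample records, and the `included_families` set are all assumptions made for the example; the real data and the final choice of families may look different.

```python
from collections import Counter

# Hypothetical records; the real dataset's field names and values may differ.
jobs = [
    {"title": "Research Software Engineer", "job_type": "Research"},
    {"title": "Finance Officer", "job_type": "Finance"},
    {"title": "Postdoctoral Researcher", "job_type": "research "},
    {"title": "Departmental Administrator", "job_type": "Admin"},
]


def job_family(job):
    """Normalise the free-text job type into a lowercase family label."""
    return job["job_type"].strip().lower()


# Count how many jobs fall into each family.
family_counts = Counter(job_family(job) for job in jobs)

# Assumed selection; which families to keep is a decision for the study.
included_families = {"research"}

# Keep only the jobs whose family is part of the study.
selected = [job for job in jobs if job_family(job) in included_families]
```

Normalising the label before counting keeps variants such as "Research" and "research " in the same family, which matters if the field is free text.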
Done - closing.
It looks like the dataset contains a large number of jobs that are not research or research-related (maintenance jobs, administrators), which means that many of the jobs I classified appeared to be neither research-related nor software-related.
Is there a way of sorting the dataset to present a better mix of jobs than just random picking?