voyanttools / VoyantServer

GNU General Public License v3.0

Terms Export current data as tab separated text does not include document index #13

Closed. pbstudent closed this issue 2 years ago

pbstudent commented 2 years ago

Perhaps a data enhancement request. The Terms export is missing document index numbering for the Trend values:

Term	Count	Trend
oer	845	0.018336777,0.016155088,0.030612245,0.011122346,0.021685114,0.023415977,0.020710059,0.017254902,0.024087317,0.024046285,0.0012515645,0.0114722755,0.0057024364,0.022937147,0.020300752,0.013664597,0.008674102,0.02770083,0.0056956876,0.028128032,0.004830918,0.013972056,0.034993272,0.040642723,0.025763359
university	505	0.011363637,0,0,0.006066734,0.011068444,0.016528925,0,0.01764706,0.0030109147,0.009220756,0.0037546933,0.00095602294,0,0.013404826,0.012781955,0.014492754,0.01858736,0.022160664,0.02034174,0.02327837,0.025120772,0.015968064,0.01076716,0.004725898,0.009541985
open	485	0.004132231,0.011308562,0.010204081,0.003033367,0.0036141856,0.012396694,0.020710059,0.012549019,0.014301844,0.011209547,0.026282854,0.008604206,0.011923276,0.02204349,0.012781955,0.012008281,0.01858736,0.024930747,0.012205045,0.007759457,0.0028985508,0.010479042,0.00538358,0.014177694,0.013358778
resources	351	0.004132231,0.01453958,0.010204081,0.012133468,0.0031624124,0.008953168,0.0147929,0.0050980393,0.00865638,0.007231965,0.02503129,0.018164435,0.00466563,0.0026809652,0.008270676,0.0070393374,0.021065675,0.011080332,0.017900731,0.009214355,0.016425122,0.005988024,0.00538358,0.007561437,0.016221374
policy	337	0.005165289,0.0064620357,0.030612245,0.008088979,0.004969505,0.006198347,0.01183432,0.0070588235,0.010538201,0.013740734,0.012515645,0.0038240917,0.0020736132,0.00953232,0.0030075188,0.004968944,0.006195787,0.002770083,0.009764036,0.0111542195,0,0.003992016,0.01345895,0.012287335,0.006679389

ajmacdonald commented 2 years ago

The Terms tool displays term data at the corpus scale and therefore doesn't have document index data. If you'd like to work with terms at the document scale, use the Document Terms tool instead.

pbstudent commented 2 years ago

The trend relative frequency values are each attached to a specific document. Rather than leaving readers to guess which document each value belongs to, the values need labels. Hence the document index, redundant or not, correctly labels these values, whereas the general label "Trend" obscures more than it reveals. If consistent data identification is important, then all columns need first-row labels that apply across data sets generated by the same software. I know that I would not be able to submit these data sets without correct labelling. So either the program produces the correct labels or I have to enter them manually (which is counter-productive). Currently, I am manually adding labels to the columns so that the data make sense to readers. Thank you for considering this change in data presentation.
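The manual labelling step described above could perhaps be scripted in the meantime. What follows is only a minimal sketch, not anything Voyant provides: it assumes the export is tab separated with the header Term, Count, Trend as in the sample earlier in this thread, and it assumes (the thread does not confirm this) that the comma-separated Trend values are listed in corpus document order. The function name `label_trend_columns` and the default labels `doc_1`, `doc_2`, ... are hypothetical; real document titles can be passed in instead.

```python
import csv
import io


def label_trend_columns(tsv_text, doc_labels=None):
    """Split the comma-separated Trend column into one labelled column per document.

    Assumption: Trend values appear in corpus document order.
    doc_labels is an optional list of names; defaults to the hypothetical doc_1..doc_N.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    rows = []
    for record in reader:
        values = [float(v) for v in record["Trend"].split(",")]
        labels = doc_labels or [f"doc_{i + 1}" for i in range(len(values))]
        if len(labels) != len(values):
            raise ValueError(
                f"{record['Term']}: {len(values)} trend values but {len(labels)} labels"
            )
        row = {"Term": record["Term"], "Count": int(record["Count"])}
        row.update(zip(labels, values))
        rows.append(row)
    return rows


# Two rows from the export above, truncated to their first three Trend values
# purely to keep the example short.
sample = (
    "Term\tCount\tTrend\n"
    "oer\t845\t0.018336777,0.016155088,0.030612245\n"
    "university\t505\t0.011363637,0,0\n"
)

rows = label_trend_columns(sample)

# Write the relabelled data back out as tab separated text.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()), delimiter="\t")
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

Each output row then carries Term, Count, and one column per document, so every value has a first-row label. If the export's document order differs from the assumption, the labels would be wrong, so it is worth checking a few values against the Document Terms tool first.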

ajmacdonald commented 2 years ago

OK, I think I understand what you mean now. I've created an entry for it here: https://github.com/voyanttools/Voyant/issues/14