Closed tif-calin closed 5 months ago
This is a great issue @tif-calin! We can do this upstream in JTree/Grammar so CancerDB.com et al. will benefit too.
`npm run test` already has tests that check whether a file's title has a valid pldbId. Maybe additional tests could be added to it?
We now have "computed measures", so it would be very easy to add a column like "numberOfLanguagesThatCompileToThis" and write a tiny JavaScript method to compute it.
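For illustration, such a method might look roughly like the sketch below. The row shape (`id` plus a space-separated `compilesTo` string) and the standalone function signature are assumptions made for this example; the actual computed-measure hook in the codebase isn't shown in this thread and may look different.

```javascript
// Hypothetical row shape: { id: "haxe", compilesTo: "javascript java" }.
// This is a standalone sketch, not the project's real computed-measure API.
function numberOfLanguagesThatCompileToThis(targetId, rows) {
  let count = 0;
  for (const row of rows) {
    // Treat a missing or empty compilesTo cell as "no targets".
    const targets = new Set((row.compilesTo || "").split(/\s+/).filter(Boolean));
    // Don't count a language as compiling to itself.
    if (row.id !== targetId && targets.has(targetId)) count += 1;
  }
  return count;
}

// Illustrative sample rows, not real database contents.
const rows = [
  { id: "typescript", compilesTo: "javascript" },
  { id: "coffeescript", compilesTo: "javascript" },
  { id: "haxe", compilesTo: "javascript java php python" },
  { id: "javascript", compilesTo: "" },
];

console.log(numberOfLanguagesThatCompileToThis("javascript", rows)); // 3
console.log(numberOfLanguagesThatCompileToThis("java", rows)); // 1
```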
I expect we will soon have a lot more data on relationships between languages, such as compilesTo, and then it might be more worthwhile to add those kinds of computed columns. Closing for now.
This is the current frequency count for the `compilesTo` column:

One major issue that comes up immediately is how `javascript` and `javascript java php python r ruby scheme` are considered to be two separate answers. Clearly this isn't a useful distinction to make.

There are a few possible solutions to this. My least favorite would be to automatically generate a bunch of `compilesTo{lang}` boolean columns. I think a better solution would be to somehow utilize Sets for fields like this. Membership checks on Sets are O(1) on average, and we don't have to worry about order or duplicated values.

Another column for which these criticisms hold is
`country`, which currently has these frequency counts:

I don't think using `and`, `xor`, or `or` as separators is a sustainable long-term solution lol
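To make the Set idea concrete, here is a minimal sketch of counting per value rather than per cell. It assumes cells are whitespace-separated lists of ids, as in the `compilesTo` example; `parseValueSet` and `frequencyCounts` are hypothetical helper names, not existing functions in the project. A column like `country` would need a different delimiter, since country names can contain spaces.

```javascript
// Split a multi-value cell into a Set of distinct values.
// Assumption: values are separated by whitespace.
function parseValueSet(cell) {
  return new Set((cell || "").split(/\s+/).filter(Boolean));
}

// Count how many cells mention each individual value, instead of
// treating every distinct cell string as its own answer.
function frequencyCounts(cells) {
  const counts = new Map();
  for (const cell of cells) {
    for (const value of parseValueSet(cell)) {
      counts.set(value, (counts.get(value) || 0) + 1);
    }
  }
  return counts;
}

// Illustrative cells, not the real column contents.
const compilesToCells = [
  "javascript",
  "javascript java php python r ruby scheme",
  "java javascript",
];

const counts = frequencyCounts(compilesToCells);
console.log(counts.get("javascript")); // 3
console.log(counts.get("java")); // 2
console.log(parseValueSet("java javascript").has("javascript")); // true
```

With this shape, `javascript` and `javascript java php python r ruby scheme` both contribute to the same `javascript` count, and the order of values within a cell no longer matters.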