Closed by sbyim 3 years ago
These are the names used in the AMC corpus data. As we agreed at the tool meeting in October, we are going to have just a few sources, and the 9 that are chosen for now are not abbreviated.
What are the 9 sources? I've just imported the data as you generated and shared it; no filtering is done on the tool side yet.
I can't find anything regarding the sources in the meeting protocol; could you post it here? If that's the case, is it possible to generate networks only for the corpora/subcorpora of interest?
Initially I was asked to use all sources for the tool. I proposed dropping most of them (they are small and of little relevance) so that we could afford more target words. But we then decided to keep them for the trial so that Tanja and Andreas could choose the sources based on the analysis. The November trial didn't happen, and I was told it would not be possible to conduct it using the tool in the near future. Therefore, we are trying to estimate the parameters without the tool. Once the basic parameters are finalised, I will compute the data for 2000 words, which could be used for the tool development.
Thx for the updates. Clearly I wasn't informed about these decisions and imported all of the November data into the tool, but that work needed to be done anyway, and I will filter out the abbreviated data on the tool side for now.
Ego networks will be generated only for the selected subcorpora whose names are not abbreviated.
It's hard to guess what the actual media source is.
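For reference, the tool-side filtering mentioned above could look roughly like the sketch below. It is only an illustration under assumptions: the record layout (a dict with a `source` key), the function name `filter_by_source`, and the placeholder names in `SELECTED_SOURCES` are all hypothetical, and the real list of 9 sources still has to come from the meeting decision.

```python
# Hypothetical allow-list: placeholder names, NOT the actual 9 sources.
SELECTED_SOURCES = {"SOURCE_A", "SOURCE_B"}


def filter_by_source(records, selected=SELECTED_SOURCES, key="source"):
    """Keep only records whose source name is in the allow-list.

    Records with an abbreviated (or missing) source name are dropped,
    because they are not in the allow-list.
    """
    return [r for r in records if r.get(key) in selected]


# Made-up example rows; the field names are assumptions for illustration.
records = [
    {"source": "SOURCE_A", "word": "w1"},
    {"source": "abbr_x", "word": "w2"},  # abbreviated name, filtered out
    {"source": "SOURCE_B", "word": "w3"},
    {"word": "w4"},                      # no source at all, filtered out
]

kept = filter_by_source(records)
```

An explicit allow-list is used here instead of trying to detect abbreviations, since the abbreviated names give no reliable hint about the underlying media source.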