Closed evangelos-bitsikas closed 2 months ago
Hello @evangelos-bitsikas,
Thank you for raising the issue. We have added a `Preprocessor.ipynb` file, which processes the raw data (also added recently) and produces `cp_corpus_4G.txt` and `cp_corpus_5G.txt`. Please pull the latest updates. You can run the cells sequentially to generate the files.
Note that one full run of the `Preprocessor.ipynb` notebook produces the `cp_corpus` file for only one network type (either `4G` or `5G`). The network type is set in the second cell of the notebook with `NET_TYPE = '4G'` or `NET_TYPE = '5G'`, so you need to run the notebook twice in total.
In short, change the second cell before the second run so that you get both CP corpora. If you have any questions, please let us know.
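If editing the cell by hand between runs is inconvenient, the two-run workflow could be scripted along the following lines. This is only a rough sketch, not part of the repository: it assumes the notebook contains a literal `NET_TYPE = '4G'` or `NET_TYPE = '5G'` assignment in a code cell, and the helper names `set_net_type` and `write_variants`, as well as the output names `Preprocessor_4G.ipynb` and `Preprocessor_5G.ipynb`, are made up for illustration.

```python
import json
import re
from pathlib import Path

# Matches an assignment such as NET_TYPE = '4G' or NET_TYPE = "5G".
NET_TYPE_RE = re.compile(r"""^(\s*NET_TYPE\s*=\s*)['"](?:4G|5G)['"]""", re.MULTILINE)


def set_net_type(notebook, net_type):
    """Return a copy of the notebook dict with NET_TYPE set to net_type."""
    if net_type not in ("4G", "5G"):
        raise ValueError(f"unsupported network type: {net_type!r}")
    patched = json.loads(json.dumps(notebook))  # deep copy via a JSON round trip
    hits = 0
    for cell in patched.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell["source"]
        # A cell's source may be stored as one string or as a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        new_text, n = NET_TYPE_RE.subn(lambda m: f"{m.group(1)}'{net_type}'", text)
        if n:
            hits += n
            cell["source"] = new_text.splitlines(keepends=True)
    if hits == 0:
        raise ValueError("no NET_TYPE assignment found in any code cell")
    return patched


def write_variants(src="Preprocessor.ipynb"):
    """Write one copy of the notebook per network type next to src (hypothetical names)."""
    src_path = Path(src)
    notebook = json.loads(src_path.read_text(encoding="utf-8"))
    outputs = []
    for net_type in ("4G", "5G"):
        out = src_path.with_name(f"{src_path.stem}_{net_type}.ipynb")
        out.write_text(json.dumps(set_net_type(notebook, net_type), indent=1), encoding="utf-8")
        outputs.append(out)
    return outputs
```

Calling `write_variants()` in the repository directory would produce two notebook copies, each of which can then be run in Jupyter or executed with `jupyter nbconvert --to notebook --execute <notebook>`. Keeping the copies in the same directory as the original means any relative paths used by the notebook should still resolve.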
@Masfiqur-Mim I appreciate your response.
The script `tokenizer_and_sim_matrix.py` is used to reproduce Figures 3 and 4 from the paper. This script requires the following files: `cp_corpus_4G.txt` and `cp_corpus_5G.txt`.
However, the methodology or code to produce these files is not provided. Without this crucial information, it is unclear how to verify the results shown in Figures 3 and 4.
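For anyone hitting the same problem, a quick check along these lines can confirm whether the two corpus files are present before running `tokenizer_and_sim_matrix.py`. It is a small sketch that assumes the script looks for the files in the working directory, which the repository does not state explicitly; `missing_corpora` is a hypothetical helper name.

```python
from pathlib import Path

# File names taken from this issue; their expected location is an assumption.
REQUIRED = ("cp_corpus_4G.txt", "cp_corpus_5G.txt")


def missing_corpora(directory="."):
    """Return the required corpus files that are not present in directory."""
    base = Path(directory)
    return [name for name in REQUIRED if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_corpora()
    if missing:
        print("Missing corpus files:", ", ".join(missing))
    else:
        print("Both corpus files found.")
```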