aga-relation closed this issue 2 years ago
Hi,
Thank you for your question.
The files with matching names can be accessed from the CodeOcean capsule we shared with our publication (https://codeocean.com/capsule/8020974/tree/v1). For instance, /data/Glu/_teX.h5 corresponds to the complex media training data, and /data/Glu/_vaX.h5 corresponds to the complex media validation data.
The full data (used to generate these splits) and the high-quality random test data can also be accessed in the data repository here:
Good luck!
Great, thank you! Could you please clarify which value in Random_testdata_defined_media.csv corresponds to gene expression?
The meanEL column corresponds to expression.
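For anyone following along, here is a minimal sketch of pulling the expression values out of that file. It assumes the CSV is comma-delimited with a header row containing meanEL; the function name read_expression is made up for illustration, and nothing here is taken from the authors' code.

```python
import csv


def read_expression(csv_path, column="meanEL"):
    """Return the values of `column` as floats, skipping blank cells.

    Assumption: the file is comma-delimited and has a header row.
    """
    values = []
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        # Fail early with a clear message if the column is missing.
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise KeyError(f"column {column!r} not found in {csv_path}")
        for row in reader:
            cell = (row[column] or "").strip()
            if cell:
                values.append(float(cell))
    return values
```

Calling read_expression("Random_testdata_defined_media.csv") should then give one expression value per non-empty row, provided the layout assumptions above hold.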
Hi, I am trying to replicate the data splits quoted in the paper and am having issues.
There is no seed in the data processing script, so there is no way to replicate your train/valid/test splits. Instead, I am trying to connect the files mentioned in the processing script with the files in the data repository, but there are no matching file names and the README doesn't explain it either. Could you please specify which data files were used for training, validation, evaluating on random sequences, and evaluating on naturally occurring sequences?
Thank you! :)
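To illustrate what a seed would have provided, here is a minimal sketch of a repeatable index split. The fractions, the default seed, and the name seeded_split are all hypothetical; this is not the paper's procedure and it cannot recover the original splits, it only shows how a fixed seed makes a split reproducible.

```python
import random


def seeded_split(n_items, valid_frac=0.1, test_frac=0.1, seed=0):
    """Split range(n_items) into train/valid/test index lists.

    The same seed always yields the same split. Fractions and seed are
    placeholder values, not the ones used in the publication.
    """
    indices = list(range(n_items))
    # A private Random instance keeps the shuffle independent of global state.
    random.Random(seed).shuffle(indices)
    n_test = int(n_items * test_frac)
    n_valid = int(n_items * valid_frac)
    test = indices[:n_test]
    valid = indices[n_test:n_test + n_valid]
    train = indices[n_test + n_valid:]
    return train, valid, test
```

Running seeded_split twice with the same arguments returns identical index lists, which is the property the processing script lacks without a seed.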