Question about the data processing steps (need help)

Hi :) I have read the paper, and it is really beautiful work. I have a school project to train a neural network that predicts the origin of herpesviruses, using only DNA sequences from NCBI. I understood the architecture of the neural network described in your paper, but I did not understand how you processed your sequences into data you could feed to the network. My plan is to align the sequences with a tool such as MUSCLE, load the alignment into a NumPy array, encode the nucleotides as numeric values, and feed the result to my neural network. I am not sure whether this plan is sound. Could you please explain how you processed your data into a form the network can take as input? And in your expert opinion, does my approach sound reasonable, or could you suggest a better one? Thanks in advance.
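
To make the plan concrete, here is a minimal sketch of the encoding step I have in mind. It assumes the sequences have already been aligned (for example with MUSCLE) and exported as FASTA. The helper names `parse_fasta` and `one_hot_encode`, the one-hot scheme, and the choice to encode gaps and ambiguous bases as all zeros are my own assumptions, not anything taken from the paper.

```python
import numpy as np

# Assumed alphabet; gap characters ("-") and ambiguity codes (e.g. "N")
# are not in it and are therefore encoded as an all-zero vector.
ALPHABET = "ACGT"


def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (name, sequence) pairs."""
    records, name, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def one_hot_encode(seqs):
    """Encode equal-length aligned sequences as an (n_seqs, length, 4) array."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must all have the same length")
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = np.zeros((len(seqs), length, len(ALPHABET)), dtype=np.float32)
    for n, seq in enumerate(seqs):
        for pos, base in enumerate(seq):
            if base in index:
                encoded[n, pos, index[base]] = 1.0
    return encoded


# Toy aligned input standing in for a real MUSCLE alignment.
example = """>seq1
ACG-T
>seq2
ACGTT
"""
records = parse_fasta(example)
X = one_hot_encode([seq for _, seq in records])
print(X.shape)
```

The resulting array has one row per sequence, one column per alignment position, and four channels per position, which is the shape I would then pass to the network.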

kr-colab/locator, issue #5