millanp95 / DeLUCS

This repository contains all the source files required to run DeLUCS, a deep learning clustering algorithm for DNA sequences.

build_DP bug #2

Closed FSS-PHL closed 3 years ago

FSS-PHL commented 3 years ago

Hi,

I'm trying to test your tool on some COVID sequences downloaded from GISAID. I put 500 sequences in FASTA format in a folder called 'fas' and got the error 'File name too long', so I renamed the files 1-500, but the error persists.

Just to be clear: I have a folder called fas. In it are 500 FASTA files, called 1.fa, 2.fa, etc.

They are formatted like this:

head fas/1.fa

>1.B_1
ACTTTCGATCTCTTGTAGATCTGTTCTCTAAACGAACTTTAAAATCTGTGTGGCTGTCAC
TCGGCTGCATGCTTAGTGCACTCACGCAGTATAATTAATAACTAATTACTGTCGTTGACA
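
For reference, the build_dp.py docstring (quoted in the error output further down) says the label must follow the accession number in the sequence ID, separated by a dot. The sketch below shows how a header such as >1.B_1 would split under that rule. It is only an illustration: split_fasta_id is a hypothetical helper, not a function from DeLUCS, and splitting on the first dot is an assumption about how the ID is read.

```python
def split_fasta_id(header_line):
    """Split a FASTA header such as '>1.B_1' into (accession, label)."""
    # Drop the leading '>' and keep only the ID (text before any whitespace).
    fields = header_line.strip().lstrip(">").split()
    if not fields:
        raise ValueError("empty FASTA header")
    seq_id = fields[0]
    # Assumption: the label is everything after the FIRST dot in the ID.
    accession, sep, label = seq_id.partition(".")
    if not sep or not label:
        raise ValueError(f"no '.label' suffix in FASTA ID: {seq_id!r}")
    return accession, label


print(split_fasta_id(">1.B_1"))  # ('1', 'B_1')
```

Under this reading, the header shown above has accession 1 and label B_1, so the header format itself matches what the docstring asks for.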

I ran the script as shown below (on Ubuntu with Python 3.8.5) and got the following error:


build_dp.py --data_path = fas 
/home/binfie1/binfiebin/DeLUCS/src/build_dp.py: line 12: 
This script builds a dataset in pickle
format from a folder with FASTA files. The
desired label of the file must be in the file ID
after the accession number separated by a dot.

:param dataset: Name of the Dataset.
:param data_path: Path of the folder with the sequences.
:returns: None

Example: python build_dp.py --data_path = '../data/Influenza'
: File name too long
/home/binfie1/binfiebin/DeLUCS/src/build_dp.py: line 14: import: command not found
from: can't read /var/mail/Bio
/home/binfie1/binfiebin/DeLUCS/src/build_dp.py: line 16: import: command not found
/home/binfie1/binfiebin/DeLUCS/src/build_dp.py: line 17: import: command not found
/home/binfie1/binfiebin/DeLUCS/src/build_dp.py: line 20: syntax error near unexpected token `('
/home/binfie1/binfiebin/DeLUCS/src/build_dp.py: line 20: `def replace(seq):'

It seems as though there are multiple import errors.

Any help would be appreciated, Liam

FSS-PHL commented 3 years ago

I think you just forgot the shebang: #!/usr/bin/python3
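
The messages in the log fit that diagnosis: "import: command not found" and "from: can't read /var/mail/Bio" are what a shell prints when it tries to run Python source line by line, and "File name too long" is most likely the shell treating the whole docstring as a command name. Renaming the FASTA files would not change any of that. A minimal sketch for checking whether a script starts with a Python shebang follows; has_python_shebang and write_temp_script are hypothetical helpers written for this illustration and are not part of DeLUCS.

```python
import os
import tempfile


def has_python_shebang(path):
    """Return True if the first line of the file is a '#!' line naming python."""
    with open(path, "rb") as handle:
        first_line = handle.readline()
    return first_line.startswith(b"#!") and b"python" in first_line


def write_temp_script(text):
    """Write text to a temporary .py file and return its path (demo helper)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
        handle.write(text)
        return handle.name


if __name__ == "__main__":
    with_shebang = write_temp_script("#!/usr/bin/python3\nimport os\n")
    without_shebang = write_temp_script('"""Docstring first."""\nimport os\n')
    try:
        print(has_python_shebang(with_shebang))     # True
        print(has_python_shebang(without_shebang))  # False
    finally:
        os.remove(with_shebang)
        os.remove(without_shebang)
```

The other way around the problem is to pass the script to the interpreter explicitly, as the docstring's own example does (python build_dp.py ...), in which case no shebang is needed.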