Open poojaparameswaran99 opened 1 week ago
Hello @poojaparameswaran99, and thanks for your interest in AlphaPulldown!
To generate features for multiple sequences, you can either save each sequence in a separate FASTA file and pass a comma-separated list of files, e.g. --fasta_paths=Q8C0M9.fasta,Q6NXK8.fasta,..., or save all the sequences in a single FASTA file.
In your case, I recommend the second option. You can then generate the features with the command:
sbatch --array=1-9000 run_create_individual_features.sh
(with the usual Slurm script and --fasta_paths=<your_file.fasta>, plus the rest of the flags). Note that sbatch options such as --array go before the script name; anything placed after it is passed to the script as an argument.
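The 9000 in --array=1-9000 presumably corresponds to one array task per sequence in the file. A minimal sketch of how that upper bound could be derived rather than typed by hand is below; the helper names `count_fasta_records` and `array_flag` are hypothetical and not part of AlphaPulldown, and the one-task-per-record mapping is an assumption based on this thread.

```python
import sys


def count_fasta_records(lines):
    """Count FASTA records by counting header lines (those starting with '>')."""
    return sum(1 for line in lines if line.startswith(">"))


def array_flag(n_records):
    """Format a Slurm --array range, assuming one array task per record."""
    if n_records < 1:
        raise ValueError("no FASTA records found")
    return f"--array=1-{n_records}"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python count_records.py your_file.fasta
    with open(sys.argv[1]) as handle:
        print(array_flag(count_fasta_records(handle)))
```

If the printed count is far below the number of sequences you expect, the file itself is worth inspecting before submitting any jobs.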
Hi @DimaMolod thank you for your response! That is how I have it now, with the format as:
>accession1
AASEQUENCEASFOLLOWS1
>accession2
AASEQUENCEASFOLLOWS2
But for some reason not all of the accessions are being saved; only a small handful are. In any case, I will try again. I do not specify the --array parameter, so that may be the issue. I will look into it and follow up if I run into problems.
I noticed you suggest running a shell script (.sh), whereas I am running the .py file directly. Perhaps this could be the issue?
Thank you!
I am trying to pass a FASTA file to the --fasta_paths argument, in a multi-record format where each entry is a >accession header line followed by its amino-acid sequence.
I have about 9k entries like this, but when I run create_individual_features.py to create the monomer outputs, only about 20 of them are parsed and written out. Why is that? Is there a maximum number of amino acids allowed per accession? Is the format incorrect?
Thank you in advance!
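One way to rule the format in or out is to inspect the file independently of AlphaPulldown. The sketch below parses a multi-record FASTA file and reports the total record count, repeated accessions, and headers with no sequence. The function names `parse_fasta` and `summarize` are hypothetical, and whether any of these conditions actually explains the 20-of-9k result is not established in this thread.

```python
from collections import Counter


def parse_fasta(lines):
    """Return a list of (header, sequence) tuples from an iterable of lines."""
    records = []
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)  # sequence may span several lines
        # sequence text before the first header is ignored
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def summarize(records):
    """Report the record count, duplicated IDs, and IDs with empty sequences."""
    # Treat the first whitespace-delimited token of the header as the ID.
    ids = [h.split()[0] if h.split() else "" for h, _ in records]
    counts = Counter(ids)
    return {
        "total": len(records),
        "duplicate_ids": sorted(i for i, c in counts.items() if c > 1),
        "empty_sequences": sorted(i for i, (_, seq) in zip(ids, records) if not seq),
    }
```

For example, `summarize(parse_fasta(open("your_file.fasta")))` (file name assumed) should report a total close to 9000 if the file is structured as described.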