rhysnewell / aviary

A hybrid assembly and MAG recovery pipeline (and more!)
GNU General Public License v3.0

How to run aviary with a batch file #198

Closed s-junguy closed 5 months ago

s-junguy commented 6 months ago

Hi, I want to run aviary with a batch file and tried it first with a .tsv and then with a .csv file. In your example .tsv file, the last 2 columns are separated not by a tab but by 2 spaces. I tried a .tsv file whose columns are separated only by tabs, and also a .tsv file where the last 2 columns were separated by 2 spaces, but neither option worked. Submitting a .csv file did not work either: it only creates a config file. So, how can I run the complete aviary workflow with a batch file? This is the output for the .csv file:

04/05/2024 09:26:24 PM INFO: Time - 21:26:24 05-04-2024
04/05/2024 09:26:24 PM INFO: Command - /home/user/miniconda3/envs/aviary/bin/aviary batch -f samples.csv -t 70 -n 70 -m 175 --use_megahit
04/05/2024 09:26:24 PM INFO: Version - 0.8.0
04/05/2024 09:26:24 PM INFO: Reading batch file: samples.csv
04/05/2024 09:26:24 PM INFO: Processing P19752-102
04/05/2024 09:26:25 PM WARNING: No assembly provided, assembly will be created using available reads...
04/05/2024 09:26:25 PM INFO: Configuration file written to /home/user/aviary/csvseagrass/P19752-102/config.yaml
04/05/2024 09:26:26 PM INFO: Processing P19752-103
04/05/2024 09:26:26 PM WARNING: No assembly provided, assembly will be created using available reads...
04/05/2024 09:26:26 PM INFO: Configuration file written to /home/user/aviary/csvseagrass/P19752-103/config.yaml
04/05/2024 09:26:26 PM INFO: Processing P19752-104
04/05/2024 09:26:26 PM WARNING: No assembly provided, assembly will be created using available reads...
04/05/2024 09:26:26 PM INFO: Configuration file written to /home/user/aviary/csvseagrass/P19752-104/config.yaml
04/05/2024 09:26:26 PM INFO: Processing P19752-105
04/05/2024 09:26:26 PM WARNING: No assembly provided, assembly will be created using available reads...
04/05/2024 09:26:26 PM INFO: Configuration file written to /home/user/aviary/csvseagrass/P19752-105/config.yaml
04/05/2024 09:26:26 PM INFO: Processing P19752-106
04/05/2024 09:26:26 PM WARNING: No assembly provided, assembly will be created using available reads...
04/05/2024 09:26:26 PM INFO: Configuration file written to /home/user/aviary/csvseagrass/P19752-106/config.yaml
04/05/2024 09:26:26 PM INFO: Beginning clustering of 5 previous Aviary runs with ANI values: [0.99, 0.97, 0.95]...
04/05/2024 09:26:26 PM WARNING: No assembly provided, assembly will be created using available reads...
04/05/2024 09:26:26 PM INFO: Configuration file written to /home/user/aviary/csvseagrass/aviary_cluster_ani_0.99/config.yaml
04/05/2024 09:26:26 PM WARNING: No assembly provided, assembly will be created using available reads...
04/05/2024 09:26:26 PM INFO: Configuration file written to /home/user/aviary/csvseagrass/aviary_cluster_ani_0.97/config.yaml
04/05/2024 09:26:26 PM WARNING: No assembly provided, assembly will be created using available reads...
04/05/2024 09:26:26 PM INFO: Configuration file written to /home/user/aviary/csvseagrass/aviary_cluster_ani_0.95/config.yaml

I use Illumina paired-end libraries (10 GB per library). I am still using the old aviary version 0.8.0.

rhysnewell commented 6 months ago

Hello,

Thanks for pointing out the formatting error in the example tsv; I'll fix that up. It should be tabs, not spaces, in that final column.
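One quick way to check a batch .tsv for this kind of delimiter problem is to count tab-separated fields per line. The following is a minimal sketch, not part of aviary: the function name `check_batch_tsv` is hypothetical, and the expected column count is simply taken from the file's own header row rather than from any aviary specification.

```shell
# check_batch_tsv FILE
# Hypothetical helper (not an aviary command): prints every line whose number
# of tab-separated fields differs from the header line. Prints nothing when
# all lines have the same number of fields as the header.
check_batch_tsv() {
  awk -F '\t' '
    NR == 1 { expected = NF; next }   # header defines the expected field count
    NF != expected {
      printf "line %d: %d tab-separated fields, expected %d\n", NR, NF, expected
    }
  ' "$1"
}
```

Calling `check_batch_tsv samples.tsv` (file name assumed) on a file where two columns are separated by spaces instead of a tab reports that line with one field fewer than the header.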

Would you please be able to paste an example of the csv/tsv file you've been using? What is your reason for using an older version of aviary? Have you tried with a newer version? I'd highly recommend updating to the latest version.

rhysnewell commented 6 months ago

I have managed to reproduce this issue and am working on a fix. For now, you can add --write-script aviary_script.sh to your command, and it should write all the batch commands to a single file, which you can then run as a bash script. This should perform the same functions as running everything within aviary.
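As a sketch of that workaround, the two steps can be wrapped together. The options are copied from the command shown in the log earlier in this thread, with `--write-script` added; the wrapper name `run_aviary_batch` is hypothetical, and the sketch assumes aviary only writes the script and leaves running it to you.

```shell
# run_aviary_batch BATCH_FILE SCRIPT_PATH
# Hypothetical wrapper: asks aviary to write the batch commands to SCRIPT_PATH,
# then runs that script with bash. The thread/memory options mirror the
# original command from this issue; adjust them for your machine.
run_aviary_batch() {
  batch_file=$1
  script_path=$2
  aviary batch -f "$batch_file" -t 70 -n 70 -m 175 --use_megahit \
    --write-script "$script_path" &&
    bash "$script_path"   # only runs if aviary exited successfully
}
```

For the files in this issue the call would be `run_aviary_batch samples.csv aviary_script.sh`.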

s-junguy commented 6 months ago

Thanks for your help! This is my samples.csv:

sample,short_reads_1,short_reads_2,long_reads,long_read_type,assembly,coassemble
P19752-101,/home/user/all_fastq/P19752_101_S33_L002_R1_001.fastq.gz,/home/user/all_fastq/P19752_101_S33_L002_R2_001.fastq.gz,NA,NA,NA,False
P19752-102,/home/user/all_fastq/P19752_102_S34_L002_R1_001.fastq.gz,/home/user/all_fastq/P19752_102_S34_L002_R2_001.fastq.gz,NA,NA,NA,False
P19752-103,/home/user/all_fastq/P19752_103_S35_L002_R1_001.fastq.gz,/home/user/all_fastq/P19752_103_S35_L002_R2_001.fastq.gz,NA,NA,NA,False
P19752-104,/home/user/all_fastq/P19752_104_S36_L002_R1_001.fastq.gz,/home/user/all_fastq/P19752_104_S36_L002_R2_001.fastq.gz,NA,NA,NA,False
P19752-105,/home/user/all_fastq/P19752_105_S37_L002_R1_001.fastq.gz,/home/user/all_fastq/P19752_105_S37_L002_R2_001.fastq.gz,NA,NA,NA,False
P19752-106,/home/user/all_fastq/P19752_106_S38_L002_R1_001.fastq.gz,/home/user/all_fastq/P19752_106_S38_L002_R2_001.fastq.gz,NA,NA,NA,False

When I was working with the Aviary pipeline, only version 0.8.0 was available. I only created this issue now, and then I saw that a newer version is available, so there is no specific reason why I worked with the older version.

rhysnewell commented 5 months ago

Did using the --write-script flag fix your issue here?

s-junguy commented 5 months ago

Yes, it did. Thank you!