Closed cifuj closed 2 years ago
Yes I just hit this error as well when trying to reproduce my previous results.
Can you check whether the folders ./fq_15000/*/ contain contigs.fasta? Or can you rerun it with the insert size set (I think the insert size of your reads is 300): python ~/StrainXpress/scripts/strainxpress.py -fq 5383_B.fq -fast -t 48 -average_read_len 150 -insert_size 300
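For anyone who wants to run the check suggested above on many cluster folders at once, here is a minimal Python sketch. It assumes the layout described in this thread (a folder such as fq_15000 with one subfolder per cluster and a contigs.fasta in each). The function names and the script itself are hypothetical and not part of StrainXpress.

```python
import sys
from pathlib import Path


def count_cluster_folders(cluster_root):
    """Count the subfolders directly under cluster_root."""
    return sum(1 for p in Path(cluster_root).iterdir() if p.is_dir())


def clusters_missing_contigs(cluster_root):
    """Return the cluster subfolders that lack a non-empty contigs.fasta."""
    missing = []
    for sub in sorted(p for p in Path(cluster_root).iterdir() if p.is_dir()):
        contigs = sub / "contigs.fasta"
        if not contigs.is_file() or contigs.stat().st_size == 0:
            missing.append(sub)
    return missing


if __name__ == "__main__" and len(sys.argv) == 2:
    # Example call (path is an assumption): python check_clusters.py ./fq_15000
    root = sys.argv[1]
    if count_cluster_folders(root) == 0:
        print(f"{root} contains no cluster folders")
    for sub in clusters_missing_contigs(root):
        print(f"no contigs.fasta in {sub}")
```

If the script reports no cluster folders at all, the problem is upstream of the assembly step, for example in how the reads were clustered.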
Same error and the folder fq_15000/ is empty.
Are there any folders under fq_15000? Does it contain any fq files? Could you show some of the content of your fq file (for example, the read names)? I think the reads may not have been clustered.
There is a fq_15000 folder, but nothing inside. These are the files I have obtained so far: all.contigs_15000.fasta (empty), all_reads_sort.map, cmd_overlap.sh, out.txt, readnames.txt (empty).
These are the first 50 read names: reads_names_20220407.txt
Yes. The reads were not clustered, so strainxpress could not assemble them (fq_15000 is empty). Now I know the problem: in the clustering step, strainxpress cannot identify the read names. Could you rename your reads as shown below?
Previously:
@VH00578:5:AAATVMCM5:1:1102:73125:16884 1:N:0:ACGGACTT+TTGATCCG
@VH00578:5:AAATVMCM5:1:1102:73125:16884 2:N:0:ACGGACTT+TTGATCCG
Renamed:
@VH00578_73125_16884/1
@VH00578_73125_16884/2
thx
advice on how to reformat?
Do you have Perl installed? Taking the format above as an example, you can convert it directly: perl -ane'if(/^\@/){@a = split/\:/; $b = (split/\s/,$a[-4])[-1]; print"$a[0]$a[-5]$b\n";}else{print}' fq_file > new_fq_file
There may be other approaches as well.
perl -ane'if(/^@/){@A = split/:/; $b = (split/\s/,$a[-4])[-1]; print"$a[0]$a[-5]$b\n";}else{print}' fq_file > new_fq_file
didn't work, this completely erased all of the headers
how have you been generating the fastq files for input to strainxpress?
You need to adapt the code to the read names in your fq file. Could you show me a read name from your fq file?
Thank you! Here: @A00201R:332:H2LYWDRXY:1:1101:13819:35446 1:N:0:TCTATCCTAA+GAGAGGTTCG
Out of curiosity, how do you typically generate the interleaved fastq that ends up going to strainxpress? Is it from samtools fastq, or another pipeline?
I hope this works. I made some mistakes in the earlier code. perl -ane'if(/^@/){@a = split/\:/; $b = (split/\s/,$a[-4])[-1]; print"$a[0]$a[-5]/$b\n";}else{print}' fq_file > new_fq_file
I noticed that GitHub altered what I typed: the letter after @ must be a lowercase a (@a); the uppercase form is incorrect.
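As an alternative to the Perl one-liner, the renaming could be sketched in Python. This is not part of StrainXpress; the function names and the script name rename_fq.py are hypothetical. It assumes standard four-line FASTQ records and Illumina-style headers like the ones posted in this thread, and it produces the instrument_x_y/read form requested earlier. Only the first line of each record is rewritten, so quality lines that happen to start with @ are left alone, and headers that do not match the expected layout pass through unchanged.

```python
import sys


def rename_header(header):
    """Convert an Illumina-style header to the form @INSTRUMENT_X_Y/READ."""
    stripped = header.rstrip("\n")
    name, _, comment = stripped.partition(" ")
    fields = name.split(":")
    if len(fields) < 7 or not comment:
        # Not the assumed Illumina layout: return the header untouched.
        return stripped
    instrument, x, y = fields[0], fields[-2], fields[-1]
    read_number = comment.split(":")[0]  # "1" or "2"
    return f"{instrument}_{x}_{y}/{read_number}"


def rename_fastq(lines):
    """Yield FASTQ lines with only the header of each 4-line record rewritten."""
    for i, line in enumerate(lines):
        if i % 4 == 0:
            yield rename_header(line) + "\n"
        else:
            yield line


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python rename_fq.py fq_file new_fq_file
    with open(sys.argv[1]) as src, open(sys.argv[2], "w") as dst:
        dst.writelines(rename_fastq(src))
```

Note that this target format keeps only the instrument name and the x/y coordinates, which are unique only within a tile, so two reads from different tiles could in principle end up with the same name.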
The fq files I download from the sequencing machine look like the example below. I don't convert them with any other tools.
@S0R0/1 @S0R0/2
didn't seem to work again :/
that did it!! thank you!!
Good!
Hi,
I get an error while running StrainXpress. I just pulled the latest version (Update pipeline_per_stage.v3.py) and I'm using the following command:
$ python ~/StrainXpress/scripts/strainxpress.py -fq 5383_B.fq -fast -t 48 -average_read_len 150
And I got the following error:
pid 67085's current affinity mask: ffffffffffffffffffff
pid 67085's new affinity mask: ff
begin...
##################################################
the 1/1 part start...
this is the: 0 for 100w lines
this is the: 1000000 for 100w lines
this is the: 2000000 for 100w lines
this is the: 3000000 for 100w lines
this is the: 4000000 for 100w lines
this is the: 5000000 for 100w lines
this is the: 6000000 for 100w lines
this is the: 7000000 for 100w lines
this is the: 8000000 for 100w lines
the 1/1 part finished...
##################################################
cat: cmd_polyte.sh: No such file or directory
successfully execute: split 5383_B.fq -l 1057832 -d -a 2 sub
successfully execute: cat cmd_overlap.sh | xargs -i -P 48 bash -c "{}";
successfully execute: cat sub.map > all_reads_sort.map
successfully execute: rm sub;
successfully execute: python /home/StrainXpress/scripts/get_readnames.py 5383_B.fq readnames.txt
successfully execute: python /home/StrainXpress/scripts/bin_pointer_limited_filechunks_shortpath.py all_reads_sort.map readnames.txt 15000 strainxpress 48
successfully execute: python /home/StrainXpress/scripts/getclusters.py strainxpress_max15000_final 48
successfully execute: python /home/StrainXpress/scripts/get_fq_cluster.py strainxpress_max15000_final_clusters_grouped.json 5383_B.fq /scratch/tmp.1268209/reads/fq_15000
successfully execute: rm -rf Chunkfile; rm strainxpress_max15000_final_clustersizes.json strainxpress_max15000_final_clusters_unchained.json strainxpress_max15000_final_clusters.json
successfully execute: cat cmd_polyte.sh | xargs -i -P 48 bash -c "{}";
Traceback (most recent call last):
File "/home/StrainXpress/scripts/strainxpress.py", line 171, in
sys.exit(main())
File "/home/StrainXpress/scripts/strainxpress.py", line 109, in main
execute(cmd_merge_contigs)
File "/home/StrainXpress/scripts/strainxpress.py", line 161, in execute
with open("output.txt","r") as file:
FileNotFoundError: [Errno 2] No such file or directory: 'output.txt'