Zymo-Research / figaro

An efficient and objective tool for optimizing microbiome rRNA gene trimming parameters
GNU General Public License v3.0

Forward reads appear to not be of consistent length #38

Open BirongZhang opened 3 years ago

BirongZhang commented 3 years ago

Hi figaro team,

Thanks for developing such a wonderful tool!

But I have a problem when I use it. I have 22 paired-end samples (primers 515FB and 926R), each one close to 300 bp. I ran FIGARO before trimming the primers.

  1. python3 $FIGARO_HOME/figaro.py -i data2 -o figaro -a 300 -f 19 -r 20 -F illumina

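    Screenshot 2021-06-09 at 17 32 11: https://user-images.githubusercontent.com/74430395/121393820-a0f26300-c948-11eb-8952-baa362aec310.png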
  2. Then I used FASTX-Toolkit to trim them to the same length: for i in data2/*.fastq; do fastx_trimmer -f 5 -l 295 -i "$i" -o "data3/$i"; done

  3. python3 $FIGARO_HOME/figaro.py -i data3/data2 -o figaro -a 300 -f 19 -r 20 -F illumina

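    Screenshot 2021-06-09 at 17 33 40: https://user-images.githubusercontent.com/74430395/121394014-d5feb580-c948-11eb-8346-f16fc77de670.png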

So what is the problem with my script? I would appreciate any suggestions, thanks!

Kind regards, Birong

michael-weinstein commented 3 years ago

This looks interesting. I think FIGARO is detecting the reads as still having some variability in length… maybe you had a few short ones in there. Try trimming down to 290 and let me know what happens.
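
One way to check for a few short reads is to tally read lengths per file. The sketch below is a minimal standard-library example and not part of FIGARO; the function name `read_length_counts` is made up for illustration, and it assumes plain four-line FASTQ records (no wrapped sequence lines), optionally gzip-compressed.

```python
from collections import Counter
import gzip


def read_length_counts(path):
    """Count how many reads of each length a FASTQ file contains.

    Assumes four-line FASTQ records, so the sequence is always the
    second line of each record.
    """
    opener = gzip.open if str(path).endswith(".gz") else open
    counts = Counter()
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # sequence line of the record
                counts[len(line.strip())] += 1
    return counts
```

Calling `read_length_counts("data3/data2/sample_R1.fastq")` (a hypothetical path) returns a mapping from read length to read count. More than one key in the result means the file still contains reads of mixed lengths.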

BirongZhang commented 3 years ago

Hi Michael,

Thanks for your efficient reply and kind help!

Yes, you are right. I tried to trim them down to 290, and this time the above error did not occur. However, there is a new error.

  1. for i in data2/*.fastq; do fastx_trimmer -f 5 -l 290 -i "$i" -o "data3/$i"; done
  2. python3 $FIGARO_HOME/figaro.py -i data3/data2 -o figaro -a 290 -f 14 -r 10 -F illumina
Screenshot 2021-06-09 at 22 29 37

I am looking forward to hearing from you, many thanks!

Birong
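
If a file still contains mixed read lengths after trimming, one option is to trim both mates to a fixed window and drop any pair in which either read is too short. As far as I can tell, `fastx_trimmer -l` only sets the last base to keep and leaves shorter reads as they are, which would let a few short reads survive. The sketch below is a standard-library illustration under stated assumptions, not a FIGARO or FASTX-Toolkit feature: `trim_pairs` and `read_records` are hypothetical names, the input is assumed to be four-line FASTQ, and the forward and reverse files are assumed to list mates in the same order. Whether this also clears the later error has not been tested here.

```python
def read_records(handle):
    """Yield (header, sequence, quality) from a four-line FASTQ file."""
    while True:
        header = handle.readline()
        if not header:
            return
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        quality = handle.readline().rstrip("\n")
        yield header.rstrip("\n"), sequence, quality


def trim_pairs(fwd_in, rev_in, fwd_out, rev_out, first_base, last_base):
    """Keep bases first_base..last_base (1-based, inclusive) of each mate.

    A pair is dropped when either mate is shorter than last_base, so
    every read written has exactly last_base - first_base + 1 bases.
    Returns (pairs kept, pairs dropped).
    """
    kept = dropped = 0
    with open(fwd_in) as f_in, open(rev_in) as r_in, \
            open(fwd_out, "w") as f_out, open(rev_out, "w") as r_out:
        for fwd, rev in zip(read_records(f_in), read_records(r_in)):
            if len(fwd[1]) < last_base or len(rev[1]) < last_base:
                dropped += 1
                continue
            for (header, seq, qual), out in ((fwd, f_out), (rev, r_out)):
                window = slice(first_base - 1, last_base)
                out.write(f"{header}\n{seq[window]}\n+\n{qual[window]}\n")
            kept += 1
    return kept, dropped
```

With `first_base=5` and `last_base=290`, which follows my understanding of the `-f` and `-l` options used above, every read written is 286 bases long.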

michael-weinstein commented 3 years ago

How long is your amplicon? Is it only 300 bases?

BirongZhang commented 3 years ago

Yes, each one is close to 300bp. Here is the quality plot, hope it helps. Thanks!

Screenshot 2021-06-10 at 00 47 22

LiaOb21 commented 2 years ago

Hi Michael, I started from practically the same situation as the one BirongZhang reported. I decided to trim the reads because FIGARO doesn't support variable lengths. I was still getting errors, so I followed your suggestion and trimmed down to 290. The first error ("reads appear to not be of consistent length") seems to be solved, but I ran into the same final error that BirongZhang reported:

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (29971,) + inhomogeneous part.

Any idea how to solve that? Thank you so much in advance!
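
For context, this is the message NumPy raises when asked to build a numeric array from rows of different lengths. The sketch below reproduces it with made-up quality scores. It is not FIGARO's code, and `stack_quality_rows` is a hypothetical helper, so treat it only as an illustration of how leftover reads of a different length could lead to this error.

```python
import numpy as np


def stack_quality_rows(rows):
    """Try to stack per-read quality scores into one 2-D integer array.

    Returns (array, None) on success or (None, error message) on failure.
    """
    try:
        return np.array(rows, dtype=np.int64), None
    except ValueError as err:
        return None, str(err)


# Reads of equal length stack into a rectangular array.
uniform, _ = stack_quality_rows([[38, 38, 37], [36, 35, 30]])
print(uniform.shape)

# One shorter read makes the rows ragged, and NumPy refuses to build the array.
_, message = stack_quality_rows([[38, 38, 37], [36, 35]])
print(message)
```

If something similar is happening here, the number in "The detected shape was (29971,)" would be the count of rows NumPy saw before it hit the mismatch, but that reading is an assumption about FIGARO's internals.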

jpmgzmn commented 10 months ago

Hi all! Thanks to this thread, I resolved the issue with variable sequence lengths. However, I've just run into exactly the same ValueError. Specifically, I got this error too:

ValueError: setting an array element with a sequence.

Has anyone resolved this since? Thank you so much!