rishavray / PILFER

piRNA cluster prediction and analysis framework

Pilfer Job killed. #2

Open vivekruhela opened 5 years ago

vivekruhela commented 5 years ago

Hello,

I tried to get piRNA clusters using your script pilfer.py. Starting from a gzipped FASTQ file of 573.0 MB, I obtained a 2.9 GB BAM file from the hg19 alignment with bowtie, and a 9.2 GB BED file after subtracting ncRNA.bed from it. I then ran the following command:

(python27) vivekr@sarabhai:~$ python /home/vivekr/PILFER/tools/pilfer.py -i /home/vivekr/piRNA/SRR7541164.bed > /home/vivekr/piRNA/SRR7541164.pirna_cluster

Killed

After a very long pause, the job was killed. Is the very large BED file responsible for this?

vivekruhela commented 5 years ago

Hi,

I would like to add a small update. As I mentioned earlier, the job was killed, probably because the BED file is very large or because too little memory was available (I am not sure which). So I tried with a smaller BED file, but I am still not getting any result. Here is the error:

(python27) vivekr@sarabhai:~$ head -n 100000 /home/vivekr/piRNA/SRR7541164.bed > /home/vivekr/piRNA/SRR7541164_1.bed ; python /home/vivekr/PILFER/tools/pilfer.py -i /home/vivekr/piRNA/SRR7541164_1.bed > /home/vivekr/piRNA/SRR7541164.pirna_cluster
/home/vivekr/PILFER/tools/pilfer.py:63: RuntimeWarning: invalid value encountered in double_scalars
  if (row[4] - mean)/sd >= sd_factor:

Please suggest.

LliliansCalvo commented 4 years ago

Hi, were you able to solve this? I am encountering a similar problem.

rishavray commented 4 years ago

@LliliansCalvo This issue would mainly occur if your BED file is too large. The pipeline assumes that you have collapsed your BED file and the count is added up in field 5. What is the size of your BED file?
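For readers unsure what "collapsed" means here, the following is a minimal sketch and not the repository's own script. It assumes a six-column BED file and groups reads by chromosome, start, end and strand; the thread does not say whether PILFER's scripts group by coordinates or by read sequence, so that grouping key is an assumption. The function name `collapse_bed` is hypothetical, and column 4 simply keeps the first read name seen in each group.

```python
def collapse_bed(lines):
    """Collapse reads with identical (chrom, start, end, strand) into one
    BED record and write the number of collapsed reads into column 5.

    Assumption: grouping by coordinates; PILFER's own scripts may differ.
    """
    groups = {}  # key -> [first read name, count]; dicts keep insertion order
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue  # skip blank lines and BED header lines
        fields = line.split()
        if len(fields) < 6:
            raise ValueError("expected at least 6 BED columns: %r" % line)
        chrom, start, end, name, strand = (
            fields[0], fields[1], fields[2], fields[3], fields[5])
        key = (chrom, start, end, strand)
        if key in groups:
            groups[key][1] += 1
        else:
            groups[key] = [name, 1]
    for (chrom, start, end, strand), (name, count) in groups.items():
        yield "\t".join([chrom, start, end, name, str(count), strand])


if __name__ == "__main__":
    import sys
    # Usage (hypothetical file names):
    #   python collapse_bed.py < reads.bed > reads.collapsed.bed
    for record in collapse_bed(sys.stdin):
        print(record)
```

Only one entry per distinct interval is held in memory, so the footprint depends on the number of unique positions rather than the number of reads. For very deep libraries even that may be too much, in which case sorting the file first and counting adjacent duplicates is the usual alternative.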

Senimene commented 1 year ago

Hello, I'm encountering the same problem:

/PILFER-master/tools/pilfer.py:63: RuntimeWarning: invalid value encountered in double_scalars
  if (row[4] - mean)/sd >= sd_factor:

The size of my BED file is ~1.63 GB. It contains 22556341 lines and looks like this:

Contig923 62546 62570 HWI-ST225:358:C0BKDACXX:5:1205:12920:21054_1 255 +
Contig559 49825 49853 HWI-ST225:358:C0BKDACXX:5:1205:12956:21174_1 255 -
Contig282 108956 108983 HWI-ST225:358:C0BKDACXX:5:1205:13564:21023_1 255 +
Contig579 1774320 1774347 HWI-ST225:358:C0BKDACXX:5:1205:13614:21048_1 255 +
...

Could you please help me resolve this problem? Thank you in advance.

Best regards,
IS

rishavray commented 1 year ago

Your BED file is not properly formatted for the pipeline. The reads need to be collapsed and the count should appear in column 5. Please use the scripts provided in the repo; that should help resolve the issue.
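The warning quoted in this thread is consistent with a zero standard deviation. If column 5 holds the same value on every line (255 in the rows posted above), an expression of the form (row[4] - mean)/sd evaluates 0/0, which NumPy reports as an invalid value. This is an inference from the warning text alone; how pilfer.py actually computes mean and sd is not shown here. The sketch below reproduces the effect under that assumption, using hypothetical helper names and a population standard deviation.

```python
import numpy as np


def safe_zscores(values):
    """Return z-scores for column-5 values, or None when the column has
    no spread (every value identical), which would otherwise give 0/0."""
    values = np.asarray(values, dtype=float)
    mean = values.mean()
    sd = values.std()  # population SD; an assumption, pilfer.py may differ
    if sd == 0:
        return None
    return (values - mean) / sd


def naive_zscore(value, mean, sd):
    """Unguarded z-score, same shape as the expression in the warning.
    The NumPy warning is silenced here so only the NaN result is visible."""
    with np.errstate(invalid="ignore", divide="ignore"):
        return (np.float64(value) - np.float64(mean)) / np.float64(sd)


if __name__ == "__main__":
    uncollapsed = [255, 255, 255, 255]     # column 5 as in the posted rows
    print(safe_zscores(uncollapsed))        # no spread in the column
    print(naive_zscore(255, 255, 0))        # 0/0 under NumPy float64
    print(safe_zscores([1, 1, 1, 9]))       # made-up collapsed counts
```

A NaN never satisfies a `>=` comparison, so under this reading no line would pass the sd_factor threshold and the output would stay empty, which matches the "no response" described earlier in the thread.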

Senimene commented 1 year ago

Hi Rishav,

Noted. I will correct the format. Thanks for your help :)

Best regards,
Imène