YeoLab / clipper

A tool to identify CLIP-seq peaks

Invalid contig error #93

Open jalalsiddiqui opened 3 years ago

jalalsiddiqui commented 3 years ago

Traceback (most recent call last):
  File "/users/PAS1143/osu8165/.conda/envs/clipper3/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "/users/PAS1143/osu8165/.conda/envs/clipper3/lib/python3.7/site-packages/clipper/src/call_peak.py", line 932, in call_peaks
    subset_reads = list(bam_fileobj.fetch(reference=str(interval.chrom), start=interval.start, end=interval.stop))
  File "pysam/libcalignmentfile.pyx", line 1081, in pysam.libcalignmentfile.AlignmentFile.fetch
  File "pysam/libchtslib.pyx", line 686, in pysam.libchtslib.HTSFile.parse_region
ValueError: invalid contig chr1_KI270708v1_random
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/users/PAS1143/osu8165/.conda/envs/clipper3/bin/clipper", line 8, in <module>
    sys.exit(call_main())
  File "/users/PAS1143/osu8165/.conda/envs/clipper3/lib/python3.7/site-packages/clipper/src/main.py", line 266, in call_main
    main(options)
  File "/users/PAS1143/osu8165/.conda/envs/clipper3/lib/python3.7/site-packages/clipper/src/main.py", line 105, in main
    peaks_dicts.append(job.get(timeout=options.timeout))
  File "/users/PAS1143/osu8165/.conda/envs/clipper3/lib/python3.7/multiprocessing/pool.py", line 657, in get
    raise self._value
ValueError: invalid contig chr1_KI270708v1_random

byee4 commented 3 years ago

Random contigs were likely trimmed from the clipper references. If clipper encounters a region that doesn't exist in the reference, it might complain. You can see which chromosomes are included for each supported species. For example, you can get a unique list of valid chromosomes from hg19 GENCODE v19 (hg19_exons.bed) using some bash:

link to clipper data/regions: https://github.com/YeoLab/clipper/tree/master/clipper/data/regions

awk -F "\t" '{print $1}' hg19_exons.bed | sort -u
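The same check can be taken one step further by comparing the chromosome names in a regions file against the contigs declared in the BAM header, since the traceback above shows pysam's `fetch` rejecting a contig name it could not find in the BAM. What follows is a minimal sketch, not part of clipper: the function names are hypothetical, and it assumes the BAM header has already been dumped to text (for example with `samtools view -H`).

```python
from typing import Iterable, List, Set, Tuple


def bed_contigs(bed_lines: Iterable[str]) -> Set[str]:
    """Collect the chromosome names found in column 1 of BED-formatted lines."""
    contigs = set()
    for line in bed_lines:
        line = line.rstrip("\n")
        # Skip blank lines and the optional header lines a BED file may carry.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        contigs.add(line.split("\t")[0])
    return contigs


def sam_header_contigs(header_lines: Iterable[str]) -> Set[str]:
    """Collect contig names from the SN: field of @SQ header lines."""
    contigs = set()
    for line in header_lines:
        if not line.startswith("@SQ"):
            continue
        for field in line.rstrip("\n").split("\t")[1:]:
            if field.startswith("SN:"):
                contigs.add(field[3:])
    return contigs


def contig_mismatches(
    bed_lines: Iterable[str], header_lines: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (names only in the BED file, names only in the BAM header)."""
    bed = bed_contigs(bed_lines)
    bam = sam_header_contigs(header_lines)
    return sorted(bed - bam), sorted(bam - bed)
```

Both functions accept any iterable of lines, so open file handles for the regions file and the dumped header can be passed in directly. Any name in the first returned list is a candidate for the "invalid contig" error.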

jalalsiddiqui commented 3 years ago

Thank you so much!

I will look into this.

Jalal



jalalsiddiqui commented 3 years ago

I ran this for about 10 hours but got no output at all. Do you know what the problem might be?

byee4 commented 3 years ago

I have seen Clipper run for longer than that. We allow 24 hours per run, although the vast majority of jobs don't take that long.

jalalsiddiqui commented 3 years ago

Thanks. Is there a way to make this run faster?

Jalal



byee4 commented 3 years ago

I've had some success running it on an EC2 instance with more cores; a c3.8xlarge, for instance, might speed it up some.

jalalsiddiqui commented 3 years ago

I use cluster computing. I can try to increase the number of threads.

Jalal



jalalsiddiqui commented 3 years ago

I am running the pipeline with 6 threads at the moment. Note that I am doing a single-end analysis using the first read only.

jalalsiddiqui commented 3 years ago

One more question: is there a way to monitor progress? There is no output, so I am not sure how far the peak calling has gotten.