shalgilab / DoGFinder


Get_loci_annotation issue #1

Closed by sjroth 5 years ago

sjroth commented 5 years ago

To whom it may concern,

I am getting an error when running Get_loci_annotation. My gtf file is GENCODE V28 for hg38. Here is the error:

  File "/gpfs/data01/bennerlab/home/sjroth/software/DoGFinder/Get_loci_annotation", line 87, in <module>
    df_merge_annot.columns=['chrom','start','end','strand','name'];
  File "/gpfs/data01/bennerlab/home/sjroth/software/miniconda3/envs/py2/lib/python2.7/site-packages/pandas/core/generic.py", line 4389, in __setattr__
    return object.__setattr__(self, name, value)
  File "pandas/_libs/properties.pyx", line 69, in pandas._libs.properties.AxisProperty.__set__
  File "/gpfs/data01/bennerlab/home/sjroth/software/miniconda3/envs/py2/lib/python2.7/site-packages/pandas/core/generic.py", line 646, in _set_axis
    self._data.set_axis(axis, labels)
  File "/gpfs/data01/bennerlab/home/sjroth/software/miniconda3/envs/py2/lib/python2.7/site-packages/pandas/core/internals.py", line 3323, in set_axis
    'values have {new} elements'.format(old=old_len, new=new_len))
ValueError: Length mismatch: Expected axis has 4 elements, new values have 5 elements
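For anyone reading the traceback: the final line means that the table being renamed has four columns while five column names are being assigned to it. The sketch below reproduces that class of error in isolation. The one-row table is a made-up stand-in, not the data DoGFinder actually builds, and `describe_mismatch` is a hypothetical helper added only for illustration.

```python
import pandas as pd

# Made-up stand-in for the merged annotation table: it has only 4 columns.
df_merge_annot = pd.DataFrame([["chr1", 100, 200, "GENE1"]])

# The five names that line 87 of the traceback tries to assign.
expected_names = ['chrom', 'start', 'end', 'strand', 'name']


def describe_mismatch(df, names):
    """Hypothetical helper: summarize how the column counts differ."""
    return "table has %d columns, %d names given" % (df.shape[1], len(names))


try:
    # Same kind of assignment as in the traceback; it fails because 4 != 5.
    df_merge_annot.columns = expected_names
    mismatch_message = None
except ValueError as err:
    mismatch_message = str(err)

print(mismatch_message)
print(describe_mismatch(df_merge_annot, expected_names))
```

Printing the first few rows of the intermediate table just before the failing line is one way to see which column is missing.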

StellamarisSoares commented 5 years ago

I am having the same error when I try to run Get_loci_annotation. I am using the gtf file from GENCODE M20 GRCm38.p6 (mouse gene annotation).

sjroth commented 5 years ago

For me it was a versioning issue. Try bedtools-2.25.0.
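A quick way to see which bedtools version is on your PATH, as a minimal sketch: it assumes that `bedtools --version` prints a line such as `bedtools v2.25.0`. The function names and the `REPORTED_WORKING` constant are hypothetical and are not part of DoGFinder.

```python
import re
import subprocess

# Version reported in this thread as resolving the error.
REPORTED_WORKING = (2, 25, 0)


def parse_bedtools_version(text):
    """Extract (major, minor, patch) from text such as 'bedtools v2.25.0'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("could not find a version number in %r" % text)
    return tuple(int(part) for part in match.groups())


def installed_bedtools_version():
    """Ask the bedtools found on PATH for its version (assumes it is installed)."""
    output = subprocess.check_output(["bedtools", "--version"])
    return parse_bedtools_version(output.decode("utf-8", "replace"))


def matches_reported_version(version):
    """True when the given version tuple equals the one reported to work."""
    return tuple(version) == REPORTED_WORKING
```

Calling `matches_reported_version(installed_bedtools_version())` in the same environment that runs Get_loci_annotation shows whether the two versions agree.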

StellamarisSoares commented 5 years ago

Thank you so much, @sjroth. That was my issue too.

sjroth commented 5 years ago

No problem :) I had a tough time getting it to work so you can message me with any questions.

Romicak commented 5 years ago

Hi, did anyone try running this pipeline with non-stranded RNA-seq data? I am having problems running the pipeline with non-stranded data: it fails with errors in step 2. I opened a new issue about this, but I wanted to check whether any of you came across similar problems. Thanks.

StellamarisSoares commented 5 years ago

Hi, Romicak! I ran DoGFinder with non-stranded RNA-seq data, but I did not have that problem. My data is non-stranded and single-end, and I've seen that your data is paired-end. Unfortunately, I don't know how to help you.

Romicak commented 5 years ago

Hi Stella, thank you for replying. It's interesting that the pipeline works with single-end, non-stranded RNA-seq data. Yes, my data is paired-end and not strand-specific. I have another data set that is paired-end but strand-specific, and the pipeline works with that: I get no errors in the pre-process step. So I was wondering whether something is missing in the "init.py" file where it checks whether the data is paired-end and what its strandedness is. When the strand is "none", perhaps it does not copy the reads from the old bam file into the new "PE"-appended bam files that it creates, and that is why those new bam files are empty for me.

sagarutturkar commented 2 years ago

@Romicak I am facing exactly the same issue. Please let me know if you have found a workaround for this.