hitaandrea / MGcount

MGcount github repository
GNU General Public License v3.0

cannot run with C elegans gtf #2

Closed tliontis closed 2 years ago

tliontis commented 2 years ago

Hi,

I am having difficulty running the program. I am running:

MGcount --gtf /_path_/Caenorhabditis_elegans.WBcel235.custom.gtf --bam_infiles /_path_/MGcount_bamfiles.txt --outdir MGcount -s 1

There is an error at the initial phase, and then it looks like featureCounts is called repeatedly and the program keeps running without succeeding:

error1

error2

error3

Error text:

"sh: 1: Primers/RNAseq/mgcount_tutorial/MGcount/.mg_4dc7m91w/dotced_1.clean_Aremoved_counts_small.csv: not found
sh: 1: Primers/RNAseq/mgcount_tutorial/MGcount/.mg_4dc7m91w/small_annot.gtf: not found"

"READ ASSIGNATION FAILED. Please check that:

  1. The gtf contains the required non-empty fields as defined by the assignation arguments "feature", "feature_output" & "feature_biotype"
  2. featureCounts is available on the path specified by --featureCounts_path argument (default: /user/bin/featureCounts)
  3. Input bam files are not empty or corrupted"

After troubleshooting for a while, I am not sure what to do. I generated the BAM files with ShortStack and have validated their results with other counting programs like QuasR and DESeq2, so the BAM files are unlikely to be corrupted. Furthermore, I'm using the custom C elegans gtf file provided by MGcount. As for featureCounts, it is executable and MGcount is clearly able to call it. I can also run featureCounts by typing it on the command line (it is in /usr/bin/).

Furthermore, in the output directory, there is a copy of my .bam files and a "short" and "long" .gff-looking file.

hitaandrea commented 2 years ago

Hi,

Let me try to reproduce your error. It looks like there is a problem with the gtf small non-coding annotations extraction during the first assignation round.

Do you see this "small" .gtf-looking file at the top level of your output directory, or inside a temporary subdirectory named ".mg_4dc7m91w", as in the error message: Primers/RNAseq/mgcount_tutorial/MGcount/.mg_4dc7m91w/small_annot.gtf: not found ?

Also, have you tried to run the program as a python module? If so, are you getting the exact same error?

tliontis commented 2 years ago

Thank you for helping me with this. The "small_annot.gtf" and "long_annot.gtf" files, as well as a copy of my bam files, get created inside a temporary .mg directory. I ran the code through the python module and got similar errors:

error4

The file name in "Primers/RNAseq/MGcount/.mg_6lpu82qy/small_annot.gtf: not found" seems to be caused by a space in my absolute path: this same .mg_6lpu82qy/small_annot.gtf exists, but its real path is longer and does not begin at "Primers". Using a path without spaces resolves this error, but does not resolve the second error: "sh: 1: Primers/RNAseq/MGcount/.mg_49u7c5bv/ced_1.clean_Aremoved_counts_small.csv: not found"

I tried running the code with relative paths (since I don't have any spaces starting at /RNAseq/...) but this leads to the same kind of error because I believe the file with bam file paths (bam_files.txt) is expecting absolute paths.

Finally, I tried escaping the space symbols with \ in the bam_files.txt, but this leads to a different error, probably related to character parsing within python: error5

I imagine the errors can be solved by using an absolute path with no spaces. I can try this but it will be a bit of work to make sure my other codes don't break!
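For readers hitting the same symptom, here is a minimal sketch of why a space in a path can make a shell report a file as "not found" starting from the middle of the path. It does not reproduce MGcount's actual code, which is not shown in this thread. It only assumes that a command is assembled as a single string and handed to a shell. The example path and the two helper functions are hypothetical. `shlex.split` is used to show how a POSIX shell would tokenize each command string.

```python
import shlex


def naive_command(tool, annotation_path):
    """Build a command by plain string concatenation (breaks on spaces)."""
    return f"{tool} -a {annotation_path}"


def quoted_command(tool, annotation_path):
    """Build the same command, quoting each piece for a POSIX shell."""
    return f"{shlex.quote(tool)} -a {shlex.quote(annotation_path)}"


# Hypothetical absolute path containing one space (not taken from the thread).
path = "/home/user/My Primers/RNAseq/MGcount/small_annot.gtf"

# shlex.split tokenizes the string the way a POSIX shell splits words.
naive_tokens = shlex.split(naive_command("featureCounts", path))
quoted_tokens = shlex.split(quoted_command("featureCounts", path))

print(naive_tokens)   # the path is cut in two at the space
print(quoted_tokens)  # the path survives as a single argument
```

In the unquoted form, the last token begins at "Primers/...", which matches the shape of the truncated paths in the error messages. Passing arguments as a list to `subprocess.run` (without `shell=True`) avoids the problem as well, since no shell word splitting takes place.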

tliontis commented 2 years ago

Making sure that the path in the --outdir argument and the paths in the input bam .txt file have NO SPACES seems to resolve the problem! I recommend labeling this issue as a Bug until it is resolved. For example, I cannot use my cloud storage because it has an obligatory space in its absolute path.
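Until a fix is available, a small pre-flight check along these lines could flag offending paths before a run starts. This is only a sketch, not part of MGcount. It assumes the bam list file holds one path per line, and the function names, file name and example paths are all hypothetical.

```python
import os
import tempfile


def read_path_list(list_file):
    """Return the non-empty, stripped lines of a text file of paths."""
    with open(list_file) as handle:
        return [line.strip() for line in handle if line.strip()]


def find_whitespace_paths(outdir, list_file):
    """Return every path (absolute outdir plus listed bams) containing whitespace."""
    candidates = [os.path.abspath(outdir)] + read_path_list(list_file)
    return [p for p in candidates if any(ch.isspace() for ch in p)]


# Demo with a throwaway list file; all paths below are made up.
with tempfile.TemporaryDirectory() as tmp:
    list_file = os.path.join(tmp, "bam_files.txt")
    with open(list_file, "w") as handle:
        handle.write("/data/run1/sample_a.bam\n")
        handle.write("/data/My Drive/sample_b.bam\n")
        handle.write("\n")  # blank lines are ignored
    problems = find_whitespace_paths("/data/results/MGcount", list_file)

print(problems)
```

An empty result means none of the checked paths contains whitespace. The output directory is converted to an absolute path first because, as reported above, the space was in the absolute path rather than in the relative part.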

hitaandrea commented 2 years ago

Many thanks for spotting the issue with spaces in paths. I am glad you could run the program. I look forward to resolving this in the coming days ;)

hitaandrea commented 2 years ago

Hi @tliontis, I uploaded a new maintenance release addressing this issue. Please let me know if you encounter any problems.