langmead-lab / monorail-external

examples to run monorail externally
MIT License

Incomplete Error in new version of Unify(ver. 1.1.0) with Mouse dataset #21

Closed Sungryong-Oh closed 1 year ago

Sungryong-Oh commented 1 year ago

Hi,

I tried to re-analyze local data with Monorail Unify version 1.1.0, but I ran into an incomplete error at the final step of Unify.

Could you please help me with the new version of Unify on a mouse dataset? I updated all annotation files, scripts, and Singularity images. After that, I confirmed that the human dataset works well with the new version of Unify (1.1.0), but the mouse dataset does not. I also checked that my Pump outputs for the mouse dataset (with the new annotations) work well with Unify version 1.0.9.

Thank you.

Sungryong-Oh commented 1 year ago

I see this problem only with mouse data. When I check the result, it seems that Unify fails to generate the single merged metadata output folder in its last step. Could you please check it?

ChristopherWilks commented 1 year ago

Hi @Sungryong-Oh,

Thanks for your patience and for the bug report. It looks like this was due to a bug in the additional sums check added in 1.1.0, which was hardcoded to human and so caused non-human organisms (e.g. mouse) to fail.

I've updated the Unifier to 1.1.1. As long as you retrieved all the reference files at the time of 1.1.0, you should be good to go by just updating the Unifier image to 1.1.1 and re-running it on your mouse data.

Sungryong-Oh commented 1 year ago

Dear Christopher,

Thanks for your update. I ran it again with Unifier 1.1.1, and it works perfectly!

I have one more quick question. If I want to run the Monorail pipeline on single-end data, should I pass the input to the Pump command only once, or twice? I'm confused because the Pump command is only documented for paired-end data.

ChristopherWilks commented 1 year ago

Hi @Sungryong-Oh,

You should be able to run with just the one FASTQ file as the final parameter to run_recount_pump.sh. However, this does mean the study_id will be set to LOCAL_STUDY due to the limitations of that script.

Using the example command from the README, it'd just be:

/bin/bash run_recount_pump.sh /path/to/recount-pump-singularity.simg SRR390728 local hg38 20 /path/to/references /path/to/SRR390728_1.fastq.gz
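
If you switch between single-end and paired-end runs, a small wrapper along these lines may help. It is only a sketch: `build_pump_cmd` is a hypothetical helper, not part of Monorail. The positional order is copied from the command above, and passing the second mate as the next positional argument is an assumption based on the README example being paired-end. The function prints the command rather than running it, so you can inspect it first.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Monorail): prints a run_recount_pump.sh
# command for single-end or paired-end input without executing it.
# Argument order follows the single-end command shown in this thread.
# Assumption: for paired-end data, mate 2 is the next positional argument.

build_pump_cmd() {
    local image="$1" run_id="$2" ref="$3" threads="$4" refs_dir="$5"
    local fq1="$6" fq2="${7:-}"

    # "local" is the run mode used in the command from this thread.
    local cmd=(/bin/bash run_recount_pump.sh "$image" "$run_id" local \
               "$ref" "$threads" "$refs_dir" "$fq1")

    # Append the second mate only when one was supplied (paired-end).
    if [ -n "$fq2" ]; then
        cmd+=("$fq2")
    fi

    # Join the array elements with spaces and print one line.
    printf '%s\n' "${cmd[*]}"
}

# Single-end: one FASTQ file as the final parameter.
build_pump_cmd /path/to/recount-pump-singularity.simg SRR390728 hg38 20 \
    /path/to/references /path/to/SRR390728_1.fastq.gz
```

Called with a seventh argument, the same function appends the second FASTQ path. Paths containing spaces would need extra quoting before the printed line is pasted into a shell.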