Erythroxylum closed this issue 8 months ago.
Currently it's not possible. It has been on my mind to add for some time though. It should be pretty straightforward, so I'll take a crack at it today.
Awesome! That would be easier than downloading everything. Thanks.
@Erythroxylum, I added this on the branch local_and_sra. I'm working on PR #146 now, but you can check out that branch and test it. Let me know if you run into issues.
Hi @cademirch, the pipeline ran until sort_gatherVcfs. When I ran the main branch with the same test batch of fastqs but without the SRA sample, snpArcher completed. The err and log files are attached. Here is the error from the log:
Checking the headers and starting positions of 142 files
[E::bgzf_flush] File write failed (wrong size)
[buf_flush] Error: cannot write to /tmp/00114.bcf
Cleaning
Thanks for your help!
Attachments: sratest20_log.txt, sraerr19.txt
@Erythroxylum I'm not sure about this, but I think this looks like a space issue in the tmp dir? Tagging @tsackton since he's familiar with this cluster.
@Erythroxylum, is big temp set to something like "./tmp" in your config? If not, that is an easy place to start.
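One quick way to test the space hypothesis is to compare the free space in the default temporary directory with the location a relative "./tmp" would live in. This is only a minimal sketch using the Python standard library; the helper name `free_gib` is made up for illustration and is not part of snpArcher.

```python
import os
import shutil
import tempfile


def free_gib(path):
    """Return the free space, in GiB, on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free / 2**30


# Compare the system default temp dir (often /tmp) with the current working
# directory, which is where a relative "./tmp" would be created.
for label, path in [("default temp", tempfile.gettempdir()),
                    ("working dir", os.getcwd())]:
    print(f"{label:>12}: {path} has {free_gib(path):.1f} GiB free")
```

If the default temp directory reports far less free space than the working directory, that would be consistent with the bgzf write failure above, though it does not prove it.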
Hello, big temp was not set to anything, so I set it as above, but now there is an error:
Traceback (most recent call last):
File "/n/holyscratch01/davis_lab/dwhite/snpArcher-shared-testsra-fastq/snpArcher/./profiles/slurm/slurm-submit.py", line 7, in
Did changing the tmp directory cause this? Replicating the commands in run_pipeline.sh within an interactive session, snakemake --version reports 7.28.3. run_pipeline.sh and the log file are attached.
Attachments: sraerr-nomodulesnakemake.txt, run_pipeline.sh.txt
I suspect this is an issue with conflicting Python versions somewhere. Not sure why it wasn't a problem before - can you check what version of python you have in your Snakemake environment, and also what the default Python version is (e.g. after module load python but before you activate the Snakemake environment)?
$ module load python
$ python --version
Python 3.10.12
$ conda activate snakemake
$ python --version
Python 3.11.4
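For anyone repeating this check, it can help to see not only the version but also which interpreter binary is actually in use. The following is a rough sketch, standard library only; `describe_interpreter` is a hypothetical helper name. Running it once after loading the cluster module and again after activating the conda environment shows whether the two setups resolve to different interpreters.

```python
import shutil
import sys


def describe_interpreter():
    """Summarize which Python is running and which one PATH would pick."""
    return {
        "running": sys.executable,
        "version": ".".join(str(n) for n in sys.version_info[:3]),
        # May be None if no executable named "python" is on PATH.
        "first_on_path": shutil.which("python"),
    }


for key, value in describe_interpreter().items():
    print(f"{key}: {value}")
```

A mismatch between "running" and "first_on_path" would suggest that a script with a `python` shebang could be picking up a different interpreter than the one in the active environment.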
I don't fully understand why this error is occurring for you now, only after changing the big temp line in the config. My best guess is this is related, somehow, to using the python cluster module, but I am not really sure.
I would suggest that the easiest thing to try is a fresh install of mamba, using miniforge3: https://github.com/conda-forge/miniforge (use Miniforge3-Linux-x86_64). You might then need to delete your existing conda environments. You can recreate your snpArcher environment with:
mamba create -c conda-forge -c bioconda -n snparcher snakemake=7.32.4
(Note that snpArcher doesn't work with snakemake 8.0 yet.)
Then you should ideally be able to make the first line of your run_pipeline.sh script:
mamba activate snparcher
and remove the module load and the existing conda stuff.
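Since the Snakemake version matters here, a small guard can catch an unsupported install before a long run starts. This is a sketch under assumptions: the function name `snakemake_supported` is invented for illustration, and the cutoff of major version 8 is taken only from the note above, not from snpArcher's own code.

```python
def snakemake_supported(version, first_unsupported_major=8):
    """Return True if `version` (e.g. "7.32.4") is below the unsupported major release."""
    major = int(version.strip().split(".")[0])
    return major < first_unsupported_major


# The string would normally come from the output of `snakemake --version`.
for candidate in ("7.28.3", "7.32.4", "8.0.0"):
    status = "ok" if snakemake_supported(candidate) else "not supported yet"
    print(f"snakemake {candidate}: {status}")
```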
I'm not positive that will work, because this is a strange error, but it feels like some kind of python version thing which is likely a conda issue, so this is probably the best troubleshooting place to start. Please keep us posted if you have any issues.
Fresh install of mamba seems to have worked. @cademirch I went to clone the branch again and it said the branch has been removed. What is the status now?
git clone -b local_and_sra --single-branch https://github.com/harvardinformatics/snpArcher.git
Cloning into 'snpArcher'...
warning: Could not find remote branch local_and_sra to clone.
fatal: Remote branch local_and_sra not found in upstream origin
We merged that branch a few weeks ago. You should be fine just using main.
OK, well you can close this then. Thanks so much for your prompt attention!
Hello, I have some samples from the SRA that I would like to include along with the local fastqs. I have downloaded the metadata sample sheet from the SRA as directed, but I am not sure how to combine it with the samplesheet.csv created for the local files. I have attached a version of the sample sheet where I just included the SRA metadata that corresponds with the samplesheet columns created with your python script, so the fq1 and fq2 columns for the SRA samples are 'NaN'. Should this work?
Thanks for the help!
Attachment: test_SRA_samplesheet.csv
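The combination described in this question can be sketched as a plain CSV merge. This is an illustration only: `fq1` and `fq2` are the only column names taken from the thread, while `sample_id`, `run_id`, and every value are hypothetical placeholders. The sketch leaves the missing `fq1`/`fq2` cells blank rather than 'NaN'; which of the two snpArcher expects is not confirmed here.

```python
import csv
import io

# Hypothetical inputs: only `fq1` and `fq2` come from the thread; the other
# column names and all values are placeholders for illustration.
LOCAL_CSV = """sample_id,run_id,fq1,fq2
local_1,,/path/to/local_1_R1.fastq.gz,/path/to/local_1_R2.fastq.gz
"""
SRA_CSV = """sample_id,run_id
sra_1,RUN_PLACEHOLDER
"""


def read_sheet(text):
    """Parse CSV text into (column names, list of row dicts)."""
    reader = csv.DictReader(io.StringIO(text))
    return list(reader.fieldnames or []), list(reader)


def combine_sheets(local_text, sra_text):
    """Stack two sheets, leaving columns missing from one sheet blank."""
    local_cols, local_rows = read_sheet(local_text)
    sra_cols, sra_rows = read_sheet(sra_text)
    # Keep the local column order, then append any SRA-only columns.
    columns = local_cols + [c for c in sra_cols if c not in local_cols]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(local_rows + sra_rows)
    return out.getvalue()


print(combine_sheets(LOCAL_CSV, SRA_CSV))
```

In practice the two inputs would be read from the local samplesheet.csv and the downloaded SRA metadata file instead of inline strings.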