dereneaton / ipyrad

Interactive assembly and analysis of RAD-seq data sets
http://ipyrad.readthedocs.io
GNU General Public License v3.0

ipyrad tries to find assembly file instead of reading parameters file #563

Open d-caraballo opened 3 weeks ago

d-caraballo commented 3 weeks ago

Hi, I am currently facing an issue with my ipyrad assembly process and would appreciate your assistance.

I have imported demultiplexed and quality-filtered sequences from STACKS using process_radtags. Initially, I attempted a preliminary assembly in ipyrad, which terminated due to running out of time. Now, I am attempting to proceed by executing each assembly step separately, specifically from steps 3 to 8. The outputs from steps 1 and 2 are already provided from STACKS.

I have the following files prepared:

Parameters File: I have configured a parameters file (params-orotettix.txt) containing the necessary settings for ipyrad.

Barcodes: Although included, I understand these are not utilized in the current ipyrad setup.

Population File: I have a population assignment file (popfile.tsv) prepared and referenced in my parameters file.

However, when I attempt to execute the batch file for step 3, I encounter the following error message:

ipyrad.assemble.utils.IPyradError: Could not find saved Assembly file (.json) in expected location.
Checks in: [project_dir]/[assembly_name].json
Checked: /home/dcaraballo/ipyrad/orotettix.json
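For reference, the location named in the error can be checked by hand. The snippet below is only a minimal sketch: `expected_json` is a hypothetical helper (not part of ipyrad) that rebuilds the `[project_dir]/[assembly_name].json` pattern from the message, using the directory and assembly name shown above.

```python
from pathlib import Path


def expected_json(project_dir: str, assembly_name: str) -> Path:
    """Build the path pattern quoted in the error: [project_dir]/[assembly_name].json.

    Hypothetical helper for illustration; it only mirrors the pattern in the
    error message and does not call ipyrad.
    """
    return Path(project_dir) / f"{assembly_name}.json"


if __name__ == "__main__":
    # Values taken from the error message above.
    path = expected_json("/home/dcaraballo/ipyrad", "orotettix")
    status = "exists" if path.exists() else "is missing"
    print(f"{path} {status}")
```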

It seems that ipyrad is searching for a previous JSON assembly file (orotettix.json). While I copied and pasted the previous JSON file, created in the initial run, ipyrad seems to ignore the updated information specified in the current parameters file.

Could you please advise on how I can completely restart the ipyrad assembly process? I would like to provide the updated parameters file along with the demultiplexed (and filtered) sequences and the population file to start afresh.

Best,

Diego

isaacovercast commented 3 weeks ago

Hello Diego,

I think I see what's happening. Even though you already demultiplexed the data and did the QC with STACKS, you still need to run steps 1 and 2 in ipyrad. If you set the sorted_fastq_path parameter to point to the demultiplexed samples and run step 1, ipyrad simply reads the sample files to get counts of raw reads per sample. If you have already QC'd the data, running step 2 should be fast and shouldn't change the files much.
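As a rough illustration of the first part of that advice, the sketch below rewrites the sorted_fastq_path entry in the text of a params file. It assumes the usual params layout of `value ## [index] [name]: description`. The helper `set_param`, the example file contents, and the path `/path/to/stacks_output/*.fq.gz` are all hypothetical placeholders, not values from this thread.

```python
def set_param(params_text: str, name: str, value: str) -> str:
    """Return params_text with the value on the line tagged [name] replaced.

    Assumes each parameter line has the form:
        <value>   ## [<index>] [<name>]: <description>
    Lines without a matching tag are returned unchanged.
    """
    out = []
    found = False
    for line in params_text.splitlines():
        if "##" in line:
            comment = line.split("##", 1)[1]
            if f"[{name}]" in comment:
                # Keep the original comment, swap only the value column.
                line = f"{value:<30} ##{comment}"
                found = True
        out.append(line)
    if not found:
        raise KeyError(f"parameter not found: {name}")
    return "\n".join(out) + "\n"


# Hypothetical params file contents, for illustration only.
example = (
    "------- ipyrad params file -------\n"
    "orotettix                      ## [0] [assembly_name]: Assembly name.\n"
    "                               ## [4] [sorted_fastq_path]: Location of demultiplexed/sorted fastq files\n"
)

# Placeholder path: point this at the files written by process_radtags.
updated = set_param(example, "sorted_fastq_path", "/path/to/stacks_output/*.fq.gz")
print(updated)
```

With the params file updated, steps 1 and 2 would then be run from the command line, for example `ipyrad -p params-orotettix.txt -s 12`, before continuing with step 3 onward.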

The ipyrad workflow requires starting from step 1 even if you have pre-demultiplexed samples. It's just a requirement of the design (it makes keeping track of things much, much easier).

Try that and let us know how it goes.

-isaac