bailey-lab / MIPTools

A suite of computational tools used for molecular inversion probe design, data processing, and analysis.
https://miptools.readthedocs.io
MIT License

Failure to assign replicates #42

Open arisp99 opened 2 years ago

arisp99 commented 2 years ago

Bug Description

Since the fixes in #22, which updated the sample sheet preparation, I have been unable to generate a sample sheet without running into errors. Specifically, the error occurs when the script attempts to assign a replicate number within each group of sample name and sample set here:

https://github.com/bailey-lab/MIPTools/blob/8e3ed00dcdbbc71d12039d47f0039bbdc77c14f5/src/sample_sheet_prep.py#L133-L135

Digging into the code a bit more, the new replicate column is filled with NaN. Somehow, the assign_replicate function is generating NaN instead of the true replicate number. This causes the astype(int) command to fail, as it cannot coerce NaN into an integer.
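The failure mode can be reproduced outside MIPTools with a few lines of pandas. This is only a sketch under stated assumptions: the toy data is invented, the `groupby(...).cumcount() + 1` numbering is one common way to number replicates and is not necessarily what assign_replicate does, and `to_int_or_report` is a hypothetical helper used to surface the rows that block the cast.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; not taken from a real sample sheet.
samples = pd.DataFrame({
    "sample_name": ["s1", "s1", "s2"],
    "sample_set": ["setA", "setA", "setA"],
})

# Assumption: number replicates 1, 2, ... within each
# (sample_name, sample_set) group. The real assign_replicate may differ.
samples["replicate"] = (
    samples.groupby(["sample_name", "sample_set"]).cumcount() + 1
)

# Simulate the reported symptom: the replicate column holds only NaN.
broken = samples.assign(replicate=np.nan)


def to_int_or_report(frame):
    """Cast 'replicate' to int; on failure return the rows holding NaN.

    Returns a (series, None) pair on success and (None, bad_rows) on failure.
    """
    try:
        return frame["replicate"].astype(int), None
    except ValueError:
        # pandas refuses to cast NaN to an integer dtype.
        return None, frame[frame["replicate"].isna()]


replicates, bad_rows = to_int_or_report(broken)
print(bad_rows)
```

Printing the rows with a NaN replicate before the cast shows which samples the numbering step failed on, which is more informative than the generic exception.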

Steps to Reproduce

  1. Build or download the development version of the container.
  2. Isolate a capture plate and sample plate from a sequencing run.
  3. Run:
    $ singularity exec -B /work/apascha1/sample-sheet-prep:/opt/analysis \
      $miptools_dev.sif python /opt/src/sample_sheet_prep.py \
      --capture-plates capture_plates.tsv --sample-plates sample_plates.tsv --output-file samples.tsv

This should raise the following exception:

Exception: Error in assigning replicates. Please make sure the 'sample_name' and 'sample_set' fields have valid, non-empty values in all provided files.
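To rule out the cause named in the exception message, the input files can be checked for blank 'sample_name' or 'sample_set' values before running the script. The sketch below assumes the files are tab-delimited with a header row (suggested by the .tsv extension, but not confirmed here); `rows_missing_required` is a hypothetical helper and is not part of MIPTools.

```python
import csv
import io

# Column names taken from the exception message.
REQUIRED = ("sample_name", "sample_set")


def rows_missing_required(handle, required=REQUIRED):
    """Return (line_number, row) pairs whose required fields are absent or blank."""
    reader = csv.DictReader(handle, delimiter="\t")
    problems = []
    for row in reader:
        # row.get() returns None for a missing column or a short row.
        if any(not (row.get(col) or "").strip() for col in required):
            problems.append((reader.line_num, row))
    return problems


# In-memory example; for a real file, pass open("sample_plates.tsv", newline="").
example = io.StringIO(
    "sample_name\tsample_set\n"
    "s1\tsetA\n"
    "\tsetA\n"
    "s3\t\n"
)
for line_number, row in rows_missing_required(example):
    print(line_number, row)
```

If this reports no rows for any input file, the inputs satisfy the condition in the message and the NaN values are more likely produced inside the replicate assignment itself.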

Expected Behavior

There should be no NaNs in the replicate column after assigning replicates.


@aydemiro Do you have any idea what might be going wrong in the assign_replicate function?