fhalab / evSeq

Computational tools for extremely low-cost, massively parallel amplicon-based sequencing of every variant in protein mutant libraries.
https://fhalab.github.io/evSeq/

Data Processing error #38

Open MaximilianoRosadio opened 11 months ago

MaximilianoRosadio commented 11 months ago

It's able to load and QC the libraries, but after that it gives me this error: Unhandled exception encountered: ''NoneType' object has no attribute 'sort_values''

palmhjell commented 11 months ago

@MaximilianoRosadio can you provide more info? At what stage does it fail? My gut says that the provided refseq and other arguments may be wrong, so the sequences can't align correctly and we get back an empty dataframe that raises that error when we attempt to sort it. If this doesn't help, please send additional info from the log file if you can.
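For reference, the exception text is exactly what Python raises when `.sort_values` is called on `None`, which suggests some step handed back `None` instead of a table. Below is a minimal sketch of that failure mode and of a guard that would turn it into a more readable message. The names `reproduce_error` and `sort_or_explain` are hypothetical and used only for illustration; this is not evSeq's actual code.

```python
def reproduce_error():
    """Return the message Python produces when sorting a missing table."""
    table = None  # stand-in for a processing step that produced no result
    try:
        table.sort_values("Well")  # "Well" is an illustrative column name
    except AttributeError as err:
        return str(err)


def sort_or_explain(table, column):
    """Sort a table-like object, or fail with a clearer message if it is None."""
    if table is None:
        raise ValueError(
            "No table was produced; check the refseq and other input arguments."
        )
    return table.sort_values(column)


if __name__ == "__main__":
    print(reproduce_error())
```

A guard like this does not fix the underlying problem, but it points at the inputs rather than at the sorting step.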

MaximilianoRosadio commented 11 months ago

We're encountering issues post-QC. We've requested a 'parsing' of the reads, but the output is coming up empty. We have a suspicion that the reads might not be meeting the QC criteria, but we're uncertain about how to confirm this. Our current hypothesis is that the system may be encountering difficulty in assigning the sequences to the respective wells.
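One rough way to check the QC suspicion independently of the program is to decode the quality line of each FASTQ record and count how many reads clear a threshold. The sketch below assumes standard 4-line FASTQ records with Phred+33 encoding; the cutoff of 25 is an arbitrary illustrative value, not evSeq's actual criterion, and `mean_phred` and `count_passing` are hypothetical helper names.

```python
def mean_phred(quality_line, offset=33):
    """Average Phred score of one FASTQ quality string (Phred+33 assumed)."""
    scores = [ord(ch) - offset for ch in quality_line]
    return sum(scores) / len(scores) if scores else 0.0


def count_passing(fastq_lines, min_mean_quality=25):
    """Return (passing, total) for reads whose mean quality meets the cutoff.

    The cutoff is an assumption for illustration, not evSeq's QC rule.
    """
    lines = [line.rstrip("\n") for line in fastq_lines]
    passing = total = 0
    # Each record is 4 lines: header, sequence, '+', quality.
    for i in range(0, len(lines) - 3, 4):
        total += 1
        if mean_phred(lines[i + 3]) >= min_mean_quality:
            passing += 1
    return passing, total


if __name__ == "__main__":
    example = ["@r1", "ACGT", "+", "IIII", "@r2", "ACGT", "+", "!!!!"]
    print(count_passing(example))
```

For a real file, the lines can be read with `gzip.open(path, "rt")` if the FASTQ is compressed. If nearly all reads pass a check like this, quality is probably not what is emptying the output.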

palmhjell commented 11 months ago

the output is coming up empty

Does this mean that an output folder is created according to the --output argument given, but no files are populated in it? Or is no output folder generated at all? The quality histograms should be created and placed in the output folder as long as QC is completed. I've never seen an instance where the output folder is not made (it's created immediately when the program runs), so make sure it's not being placed somewhere you're not expecting.

See https://fhalab.github.io/evSeq/5-outputs.html
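To tell the two cases apart quickly (folder missing versus folder present but empty), something like the following sketch can list whatever is under the output location. It makes no assumptions about evSeq's folder layout; `summarize_output` is a hypothetical helper, and the path passed in should be whatever was given to --output.

```python
from pathlib import Path


def summarize_output(folder):
    """Report whether a folder exists and list every file found under it."""
    root = Path(folder)
    if not root.is_dir():
        return {"exists": False, "files": []}
    files = sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )
    return {"exists": True, "files": files}


if __name__ == "__main__":
    import sys

    # Usage: python check_output.py <path given to --output>
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print(summarize_output(target))
```

An existing folder with an empty file list would mean the run started but produced nothing, while a missing folder points at the path rather than the data.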

Once you've located the quality histograms, knowing whether they look good or bad will help troubleshoot this going forward. This doesn't explain your original exception, but it's a start.

If the histograms don't explain the issue (and even if they do), make sure you've run the example data as described here: https://fhalab.github.io/evSeq/4-usage.html#post-installation. This way we know whether it's a program issue or one specific to your data.