Open t-carroll opened 4 years ago
Was able to troubleshoot this further. Looks like in convert_matrix_from_mtx_to_loom.py, there is a block of code:
path_genes = os.path.join(path_input,'features.tsv') genes = pd.read_csv(path_genes, sep='\t', header=None) genes.columns = ['gene_id','gene_id2'] genes = genes.drop(columns = ['gene_id2'])
Where it is expecting 2 columns in the "features.tsv" file. However, all of the relevant features.tsv files generated by the test pigx_scrnaseq command in my environment generate 3 column outputs. Here's an example, but it's the same for all the various features.tsv generated for each sample:
head -3 tests/out/Mapped/WT_HEK_4h_br1/hg19/WT_HEK_4h_br1_Solo.out/Gene/filtered/features.tsv
ENSG00000080573 ENSG00000080573 Gene Expression ENSG00000267650 ENSG00000267650 Gene Expression ENSG00000080511 ENSG00000080511 Gene Expression
I therefore modified the relevant code on my fork's devel branch to the following:
path_genes = os.path.join(path_input,'features.tsv') genes = pd.read_csv(path_genes, sep='\t', header=None) genes.columns = ['gene_id','gene_id2','gene_expression'] genes = genes.drop(columns = ['gene_id2','gene_expression'])
And the test command now completes without error. Do the features.tsv files generated by your guix install on the test command only have two files? If so I imagine I have the wrong version of some dependency (maybe STAR? I have 2.7.5a).
In any case, the HTML report generated by my test command (attached) seemingly indicates I am having further issues with the pipeline, and may be generating incorrect results. For instance, the number of uniquely mapped UMIs is a tiny fraction of the total UMIs, and plots 3.0.2 and 3.0.3 are completely blank. Would it be possible to generate an example report for your test pipeline command (e.g. hg19.scRNA-Seq.report.html generated by your own installation) for users to compare against to insure proper installation?
Thanks for the help! hg19.scRNA-Seq.report_files.zip
Dear Tom,
Thank you so much for reporting the message. I'll look into the error and sort it out.
Could you please just tell me which pandas version you are using?
Best,
Vedran
Hi Vedran,
Thanks for looking into it! It looks like I'm using pandas version 1.0.3:
pip freeze | grep "pandas"
pandas==1.0.3
Hi BIMSB team,
I've been working on getting your pipeline up and running as a non-root user on our group's compute cluster. Unfortunately, installing guix isn't an option in my case, but after some trial and error I was able to put together a conda environment with the listed dependencies that manages to get through the configure/make process.
However, when I go to run the pipeline on the test dataset, the pipeline fails with the following message:
./pigx-scrnaseq tests/sample_sheet.csv -s tests/settings.yaml
I then checked the log referenced in the error message:
cat /mnt/lustre/users/tcarroll/bin/git/pigx_scrnaseq/tests/out/Log/WT_HEK_0h_br1.hg19.convert_matrix_from_mtx_to_loom.log
Looks like some issue with pandas, where there's 2 elements instead of the expected 3? I'm not as experienced with Python so having trouble figuring this out. Perhaps it's some dependency I missed during my manual local install? If you could provide any advice on what might be going on here or where to look for more troubleshooting info it would be most appreciated. I'm running in a CentOS Linux 7 (Core) environment, with tools and packages installed by Conda where possible and manually otherwise.
Thanks for the help! -Tom