Closed katieemelianova closed 1 year ago
Hi Katie,
@hxin knows this script a lot better than I do, but it looks to me like the --samples-origin parameter should be a space-separated list of the species that the samples contain, with one entry per sample, rather than the sample name itself. So, because you've only got the one sample in your TSV file, it should just be the name of the species ("incarnata" or "fuchsii") that SRR3330397 contains. Could you possibly give that a go and see what happens?
Hiya,
I gave it a go using two ways of supplying the species name:
sargasso_parameter_test rnaseq --samples-origin incarnata --mismatch-setting '0 2 4' --minmatch-setting '0 2 4' --multimap-setting '1' --plot-format png incarnata_sample.tsv incarnata_test incarnata /scratch/botany/katie/orchid/ParentalRNAseq/incarnata/incarnata_db fuchsii /scratch/botany/katie/orchid/ParentalRNAseq/fuchsii/fuchsii_db
and
(sargasso) [emelianova@login01 parameter_test]$ sargasso_parameter_test rnaseq --samples-origin "incarnata" --mismatch-setting '0 2 4' --minmatch-setting '0 2 4' --multimap-setting '1' --plot-format png incarnata_sample.tsv incarnata_test incarnata /scratch/botany/katie/orchid/ParentalRNAseq/incarnata/incarnata_db fuchsii /scratch/botany/katie/orchid/ParentalRNAseq/fuchsii/fuchsii_db
And both still give me the error:
Error: number of sample does not equal to number of sample origin.
I will carry on trying to figure it out but if you have any other ideas I can try those too :)
Best,
Katie
Ok, that is weird - as far as I can tell from having a look at the code, that check is just counting the number of items in the "samples-origin" list, and checking that it's equal to the number of lines in the sample TSV. Is there any chance that there's anything like an extra empty line in the sample TSV file?
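A check of that kind could be sketched in shell roughly as follows. This only illustrates the comparison described above and is not the actual Sargasso code; the function name `count_check` and its messages are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: compare the number of entries in a samples-origin
# string with the number of lines in a samples file.
count_check() {
    local origins="$1" samples_file="$2"
    local n_origins n_samples
    # Count whitespace-separated entries in the origin list.
    n_origins=$(echo "${origins}" | awk '{print NF}')
    # Count newline-terminated lines; a trailing blank line inflates this.
    n_samples=$(wc -l < "${samples_file}" | tr -d ' ')
    if [ "${n_origins}" -ne "${n_samples}" ]; then
        echo "mismatch: ${n_origins} origins vs ${n_samples} sample lines"
        return 1
    fi
    echo "ok: ${n_origins} origins, ${n_samples} sample lines"
}
```

Called as `count_check "incarnata" incarnata_sample.tsv`, a sketch like this would report a mismatch if the file had a stray empty line at the end, which is why that is worth ruling out first.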
No, there was one before and I thought I had cracked it but no luck unfortunately! The tsv file looks like this, with no newlines after the first one:
SRR3330397 /scratch/botany/katie/orchid/ParentalRNAseq/incarnata/SRR3330397_1.fastq /scratch/botany/katie/orchid/ParentalRNAseq/incarnata/SRR3330397_2.fastq
Am a bit baffled by this :-) . Two things to try:
1) Could you try replacing the tabs in your samples TSV file with spaces?
2) If that doesn't work, what do the following commands output?
SAMPLES=`cut -d ' ' -f -1 incarnata_sample.tsv | paste -d " " -s`
echo "${SAMPLES}" | awk -F' ' '{print NF}'
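To see why the delimiter might matter, the two commands can be run against throwaway files. This is a sketch under the assumption that sample names are extracted the way the diagnostic commands above do it; the file names, shortened paths, and the `count_fields` helper are made up for illustration. `cut` prints a line unchanged when it contains no delimiter, so with a space delimiter a tab-separated line passes through whole, and `awk -F' '` then splits it on any run of whitespace, tabs included.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run the two diagnostic commands on a given file
# and print how many whitespace-separated fields the result contains.
count_fields() {
    local samples
    samples=$(cut -d ' ' -f -1 "$1" | paste -d " " -s -)
    echo "${samples}" | awk -F' ' '{print NF}'
}

workdir=$(mktemp -d)
# One sample line, once with tabs and once with spaces (made-up paths).
printf 'SRR3330397\tr_1.fastq\tr_2.fastq\n' > "${workdir}/tabs.tsv"
printf 'SRR3330397 r_1.fastq r_2.fastq\n' > "${workdir}/spaces.tsv"

tab_count=$(count_fields "${workdir}/tabs.tsv")
space_count=$(count_fields "${workdir}/spaces.tsv")
echo "tabs: ${tab_count} field(s), spaces: ${space_count} field(s)"

# Converting tabs to spaces, as suggested in 1) above.
tr '\t' ' ' < "${workdir}/tabs.tsv" > "${workdir}/converted.tsv"
converted_count=$(count_fields "${workdir}/converted.tsv")
echo "converted: ${converted_count} field(s)"
```

If the field count for the tab-separated file comes out larger than the number of samples, that would be consistent with the error above, although it would not by itself explain why tabs work in other environments.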
Aha! It was the spaces that did it! Thanks so much for helping with that, sorry, my bad for incorrectly formatting the file! :D
Not your fault, it's not unreasonable to use tabs in a "TSV" file 😂. Weirdly, using either tabs or spaces in the samples file works for us here, so there must be something different about your execution environment, I guess? I assume it's a Linux machine that you're running on? What version of bash does it have?
Yep, Linux! Not sure if it's the info you were looking for, but this is what I think is the bash version:
echo "${BASH_VERSION}"
4.4.20(1)-release
Hmm, exactly the same version as here! So, I don't know why it wasn't working for you with tabs - but anyway, since we've managed to find a workaround to make it work, I'll go ahead and close this?
Hello!
I am having a go at running the sargasso_parameter_test script but I'm running into an error which I can't seem to work out.
My command:
my sample tsv file:
I'm getting the error:
I've tried to match the names of the sample everywhere I think, but no luck. Do you know what could be the problem?
Thank you! :)
Katie