Closed vezzi closed 9 years ago
I'll look into this tomorrow.
@vezzi Check out 64a892b203104e947e504841873031753afa07ef; I think that this is fixed now. I tested on the files that you sent me, and everything looks ok now. Let me know if there are any further problems with this, or if we can close this issue.
Great Johan... I am pretty close to starting to update charon with alignment results, and after that I only need to write the part that triggers the best practice analysis....
Maybe Friday I will be able to run the pipeline on the 7 samples we have so far!!!!!
:thumbsup: Can you confirm that this is working now? Just want to make sure that I haven't missed anything. Looking forward to the moment when all parts are in place and we can push a run through it.
I will give it a try this afternoon; right now I do not want to mess too much with my current test folder..
Ok. :smile_cat:
Not yet solved....
sthlm2UUSNP -i /proj/a2010002/nobackup/NGI/analysis_ready/DATA/M.Kaller_14_06/ -o /proj/a2010002/nobackup/NGI/analysis_ready/DATA_UUSNP/M.Kaller_14_06
wc -l /proj/a2010002/nobackup/NGI/analysis_ready/DATA_UUSNP/M.Kaller_14_06/140702_AC41A2ANXX/report.tsv
33
sthlm2UUSNP -i /proj/a2010002/nobackup/NGI/analysis_ready/DATA/M.Kaller_14_06/ -o /proj/a2010002/nobackup/NGI/analysis_ready/DATA_UUSNP/M.Kaller_14_06
wc -l
65
Can you remove the append and recreate the file each time?
The problem is that right now I am recreating the tsv file once for each sample. This needs to change, but for now it simplifies my life.
Do you run it without removing the old folder structure first? That will probably cause this problem. I'd recommend removing the old folder structure and then recreating it. Since it's all hard links there is really no cost to doing so.
However, if it's important to you to be able to run the same command more than once and have the file appended with only the new information, I can fix that by adding some extra logic to the app.
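To make the append versus recreate difference concrete, here is a minimal Python sketch. It is not the sthlm2UUSNP code (which is not shown in this thread); the function names `write_report` and `count_lines` are hypothetical, and it assumes the header is written only when the file is first created, which would be consistent with the 33 and 65 line counts reported above (1 header + 32 rows, then 1 header + 64 rows).

```python
import os
import tempfile

# Column names as they appear in the report.tsv shown later in this thread.
HEADER = "#SampleName\tLane\tReadLibrary\tFlowcellId\n"


def write_report(path, rows, append):
    """Write rows to a report file, either appending to it or recreating it."""
    is_new = not os.path.exists(path)
    mode = "a" if append else "w"
    with open(path, mode) as handle:
        # The header is written when the file is new or is being recreated.
        if is_new or not append:
            handle.write(HEADER)
        for row in rows:
            handle.write("\t".join(row) + "\n")


def count_lines(path):
    """Count the lines in a file, like `wc -l`."""
    with open(path) as handle:
        return sum(1 for _ in handle)


rows = [
    ("P567_101", "1", "A", "AH0CCVADXX"),
    ("P567_101", "2", "A", "AH0CCVADXX"),
]

with tempfile.TemporaryDirectory() as tmp:
    # Running twice in append mode repeats every data row.
    appended = os.path.join(tmp, "appended.tsv")
    write_report(appended, rows, append=True)
    write_report(appended, rows, append=True)
    appended_lines = count_lines(appended)

    # Running twice in write mode leaves a single copy of the rows.
    recreated = os.path.join(tmp, "recreated.tsv")
    write_report(recreated, rows, append=False)
    write_report(recreated, rows, append=False)
    recreated_lines = count_lines(recreated)
```

With write mode the command becomes safe to rerun on an existing folder structure, at the cost of rewriting the file each time.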
The second time I rerun it I do it without removing the previously created folder structure.
I need it?... right now I do not really like what I do: given a flowcell, I scan the samples of that flowcell one by one and I do:
Clearly, rerunning sth2UUSNPSEQ for each sample is not optimal, but for the moment it saves me from adding too much logic to check whether I have already created the UUSNPSEQ folder structure for that flowcell.
Anyway, when a new flowcell is delivered I need to rerun Sthl2UUSNPSEQ, and this will recreate the current problem. I cannot delete the folder structure, as at that moment the data could be in use by a running instance of piper.
The best solution would be to call sthl2UUSNPSEQ at flowcell level, like this
sthl2UUSNPSEQ NGI_project_format UUSNPSEQ_dir FLOWCELL
This will create a new directory in UUSNPSEQ_dir for the new FLOWCELL run. In this way I can check whether UUSNPSEQ_dir/FLOWCELL exists and decide whether or not to run the command. At that point you are not required to add extra logic to avoid appending already existing fields.
I do not know which one is the best solution for you (or the simplest). I would prefer the second one (build FLOWCELL-specific UUSNPSEQ folders), but if modifying the current version is easier for you, that is fine with me.
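The flowcell-level check described above could be sketched roughly like this in Python. This is only an illustration of the idea, not code from piper_ngi or sthlm2UUSNP: `convert_if_missing` and `fake_conversion` are hypothetical names, and the actual converter call is replaced by a stand-in callable so the sketch does not assume any particular command-line interface.

```python
import os
import tempfile


def convert_if_missing(uusnp_dir, flowcell, run_conversion):
    """Run the conversion only when uusnp_dir/flowcell does not exist yet.

    run_conversion is a hypothetical callable standing in for the real
    converter invocation; it receives the flowcell name.
    Returns True if the conversion was triggered, False if it was skipped.
    """
    flowcell_dir = os.path.join(uusnp_dir, flowcell)
    if os.path.isdir(flowcell_dir):
        return False
    run_conversion(flowcell)
    return True


with tempfile.TemporaryDirectory() as uusnp_dir:
    calls = []

    def fake_conversion(flowcell):
        # Stand-in: creates the flowcell directory as the converter would.
        os.makedirs(os.path.join(uusnp_dir, flowcell))
        calls.append(flowcell)

    first = convert_if_missing(uusnp_dir, "140702_AC41A2ANXX", fake_conversion)
    second = convert_if_missing(uusnp_dir, "140702_AC41A2ANXX", fake_conversion)
```

Here the first call triggers the conversion and the second is skipped, so an existing flowcell folder that piper may be reading is never touched again.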
Just checking that this is what you want, and if so it should be easy to implement:
./sthlm2UUSNP --input_root <sthlm project root folder> --out_root <root of uppsala style project> --flowcell <restrict the creation to the following flowcell>
Just give me a thumbs up that this is what you want, and I'll fix it asap.
Thumbs up
This should be fixed as of v1.2.0-beta16.
There is an example of how to run it:
./target/pack/bin/sthlm2UUSNP --input_root src/test/resources/testdata/Sthlm2UUTests/sthlm_runfolder_root --out_root test --flowcell 140710_AC41A2ANXX
@vezzi test it and see if it does what you want it to do.
I tried it, and if possible I would like to overwrite the existing .*tsv file rather than appending to it.
If I run
sthlm2UUSNP --input_root /proj/a2010002/nobackup/NGI/analysis_ready/DATA/A.Wedell_13_03/ -o /proj/a2010002/nobackup/NGI/analysis_ready/DATA_UUSNP/A.Wedell_13_03/ --flowcell 130611_AH0CCVADXX
on a project I already converted, I get:
cat A.Wedell_13_03/130611_AH0CCVADXX/*tsv
#SampleName Lane ReadLibrary FlowcellId
P567_101 1 A AH0CCVADXX
P567_101 2 A AH0CCVADXX
P567_101 1 A AH0CCVADXX
P567_101 2 A AH0CCVADXX
i.e., the last two lines are repeated. We can check this on the pipeline side with no big effort (and we will do it anyway to avoid recreating files that have already been created), but to me this appending issue sounds like unexpected behaviour of the program.
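A pipeline-side guard against the repeated rows could look something like the following Python sketch. This is purely illustrative and hypothetical (the function name `drop_repeated_rows` is made up, and it assumes the report columns are tab-separated); it simply drops exact repeats while keeping the first occurrence of each line.

```python
def drop_repeated_rows(lines):
    """Return lines with exact repeats removed, keeping first-seen order."""
    seen = set()
    unique = []
    for line in lines:
        if line not in seen:
            seen.add(line)
            unique.append(line)
    return unique


# The report content shown above, with the two repeated rows.
report = [
    "#SampleName\tLane\tReadLibrary\tFlowcellId",
    "P567_101\t1\tA\tAH0CCVADXX",
    "P567_101\t2\tA\tAH0CCVADXX",
    "P567_101\t1\tA\tAH0CCVADXX",
    "P567_101\t2\tA\tAH0CCVADXX",
]

cleaned = drop_repeated_rows(report)
```

This only hides the symptom; the file is still appended to on every run, so fixing the converter itself remains the cleaner option.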
You are absolutely right that this is an unexpected behavior. I'll make sure to fix it now. Will let you know when it's done.
@vezzi Try it now!
@johandahlberg it works. @mariogiov sthlm2UUSNP works as expected, so you can remove the patch I added in piper_ngi.
:smiley_cat:
So I am trying to run some large tests, but I noticed that there is a bug in sthlm2UUSNP.
When running this command:
I get the following, correct, folder structure:
but the report.tsv file looks like this:
Only one sample is present; 24 lines (8 for each sample) are missing. This clearly causes piper to fail.
In case you need it here you can see the complete folder structure (IGN format)
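A quick way to spot this kind of problem (samples present in the folder structure but absent from report.tsv) is to count the report rows per sample. The Python sketch below is a hypothetical check, not part of piper or sthlm2UUSNP: the function names and the sample names `sample_1` to `sample_4` are invented for illustration, and it assumes the report is tab-separated with the sample name in the first column.

```python
import csv
import io
from collections import Counter


def rows_per_sample(report_handle):
    """Count data rows per sample name (first column), skipping '#' headers."""
    counts = Counter()
    for row in csv.reader(report_handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        counts[row[0]] += 1
    return counts


def missing_samples(report_handle, expected_samples):
    """Return the expected samples that have no rows in the report."""
    counts = rows_per_sample(report_handle)
    return sorted(s for s in expected_samples if counts[s] == 0)


# Hypothetical report: only one of four samples made it in, with 8 lanes.
report_text = "#SampleName\tLane\tReadLibrary\tFlowcellId\n" + "".join(
    "sample_1\t{}\tA\tAC41A2ANXX\n".format(lane) for lane in range(1, 9)
)
expected = ["sample_1", "sample_2", "sample_3", "sample_4"]

counts = rows_per_sample(io.StringIO(report_text))
missing = missing_samples(io.StringIO(report_text), expected)
```

In this made-up case three samples are reported missing, i.e. 3 x 8 = 24 absent rows, the same pattern as described above.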