Open Jalalalzanin opened 3 years ago
Hi @Jalalalzanin, there are some fixes in #1547 but I'm not sure we caught this one yet. I'll have a look!
thank you @hexylena
Also, under the subtitle "Aggregating data", the tutorial gives these parameters for the "Datamash (operations on tabular data)" tool:
“Group by fields”: 1
“Operation to perform on each group”:
“Type”: Count
“On column”: Column: 1
With these, the results do not match those shown in the tutorial. I changed the parameters as follows:
“Group by fields”: 2
“Type”: Count
“On column”: Column: 2
and the results then came out correct.
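To illustrate why the choice of group column matters here, the following is a minimal Python sketch of a group-by-and-count operation on a tab-separated table. It is not Datamash's implementation; the function name `count_by_column` and the example rows are hypothetical and are not taken from the tutorial data.

```python
from collections import Counter


def count_by_column(rows, group_col):
    """Count rows per distinct value in the 1-based column `group_col`."""
    counts = Counter(row[group_col - 1] for row in rows)
    # Sort the (value, count) pairs so the output order is stable.
    return sorted(counts.items())


# Hypothetical two-column table: column 1 is unique per row, column 2 repeats.
rows = [
    ["geneA", "chr1"],
    ["geneB", "chr1"],
    ["geneC", "chr2"],
]

print(count_by_column(rows, 1))  # every value is its own group
print(count_by_column(rows, 2))  # repeated values are counted together
```

If the first column holds a unique identifier, grouping on it gives a count of 1 for every row, which may be one reason the two parameter sets produce different results.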
@shiltemann if you have time to test #1547, I think both of these issues should be fixed.
Hi @Jalalalzanin. We have published a new version of the tutorial. Do you perhaps have some time to test this version out? https://training.galaxyproject.org/training-material/topics/assembly/tutorials/ecoli_comparison/tutorial.html
Hi @hexylena, I will definitely do that, thank you for your efforts.
Hi @hexylena, regarding the updated version of the tutorial: sorry, I have some notes. At the beginning, when uploading complete genomes from NCBI, the Tab-delimited file option is not available; only TSV is offered. I used the old file from the previous version of the tutorial instead. However, when the Cut tool is used to prepare the file, the output file is as shown below, and these are the input parameters of the tool.
When I moved to the next step, using the Rule-based uploader to upload the sequences to Galaxy, there was an error with the Regular Expression (I repeated this step many times and got the same error).
BTW, the Cut tool is available in two versions; I used the last one, highlighted with a red line.
thank you
Hi @Jalalalzanin thanks for testing this out! NCBI seems to have changed the format of their table, so I'm rewriting that bit. Thanks for the detailed report!
For the multiple cut tools, the new galaxyproject/galaxy#10024 feature will hopefully fix that. I will annotate the tools appropriately.
My pleasure @hexylena. Yes, I noticed that NCBI has been updated since the beginning of 2020. I will skip uploading the sequences into Galaxy and go to the step "Comparing genome architectures" as provided in the tutorial. If I find any problems I will mention them here (sorry to bother you). Thank you.
If I find any problems I will mention them here (sorry to bother you)
Please do! We really appreciate this reviewing help :) Thanks for doing this.
It's my pleasure @hexylena, many thanks to you and the whole Galaxy team for your efforts.
Under the subtitle “Convert LASTZ output to BED”, in the explanation of “Converting to BED” and also in step 7: to get the results shown in the tutorial, I had to change the parameters as follows:
thank you
Hi @hexylena, under the subtitle "Extract CDSs from annotation datasets", step 4: the parameters of the Collapse Collection tool required for this step are not available.
thanks for help
step 4 the parameters of Collapse Collection tool required for this step are not available
Ok this one is funny :) It was a bug in that version of the tool which I fixed. So it's prepend instead of append and I failed to update the training accordingly.
I've pushed fixes for all of these on my branch, thanks again for reviewing!
So I will continue the tutorial after the update, thank you @hexylena.
The tutorial has been updated in https://github.com/galaxyproject/training-material/pull/2016 if you, @Jalalalzanin, or anyone else want to check further.
I will check the tutorial in the coming days. If there is any problem, I will send you feedback. Thank you @hexylena for your efforts.
Awesome, thanks so much @Jalalalzanin!!
My pleasure @hexylena
@hexylena I started the tutorial from the step "Comparing genome architectures", after downloading the data from the direct link provided at the beginning of the tutorial (I will check the download steps one by one later).
Under the subtitle "Getting sequences and annotations", when the Cut tool is used with the provided parameters the results come out different, because columns c10,c15 are selected. The correct columns to select are c11,c19 to get the results explained in the tutorial. These are the results when the c10,c15 parameters are used, and this is when I changed the parameters to c11,c19; it came out correct as you explained in the tutorial.
In the next step, a problem with the regular expression was found, as follows:
Ok, the c10/15 one looks like it is used twice, and I forgot to update one of them. And I used a different value in my galaxy history than works with the zenodo dataset, I guess, because the format of the TSV file changed.
I've rewritten the intro to say "please use this dataset from zenodo" to avoid this issue in the future.
For the regex, I updated it to use column 8 rather than column 11. Can you retry with c8,c20 instead of c19? That's the refseq url rather than genbank, so that will break in other ways.
https://github.com/galaxyproject/training-material/pull/2072 this should ensure we're all using the zenodo dataset, that the correct columns (c8, c20) are used which produce the following results:
GCA_002079225.1 ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/002/079/225/GCA_002079225.1_ASM207922v1
GCA_002761835.1 ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/002/761/835/GCA_002761835.1_ASM276183v1
GCA_900186905.1 ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/186/905/GCA_900186905.1_49923_G01
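As a rough illustration of the column selection discussed above, here is a minimal Python sketch that picks 1-based columns out of tab-separated lines, similar in spirit to selecting c8,c20. It is not the Galaxy Cut tool; the helper name `cut_columns` and the 20-column filler row are hypothetical, with only the accession and FTP path copied from the results above.

```python
def cut_columns(lines, columns):
    """Return the requested 1-based columns from each tab-separated line."""
    selected = []
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        selected.append([fields[c - 1] for c in columns])
    return selected


# Hypothetical 20-column row; only columns 8 and 20 carry real values.
fields = [f"col{i}" for i in range(1, 21)]
fields[7] = "GCA_002079225.1"
fields[19] = (
    "ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/002/079/225/"
    "GCA_002079225.1_ASM207922v1"
)
table = ["\t".join(fields) + "\n"]

for accession, ftp_path in cut_columns(table, [8, 20]):
    print(accession, ftp_path, sep="\t")
```

Because the selection is purely positional, any change in the column layout of the source table silently shifts which values are picked, which is consistent with the mismatches reported in this thread.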
Yes, I repeated that and it is working well now. Thank you @hexylena for the quick response.
Hi @hexylena, the tool "Replace Text" is not available in the list of tools.
Hi @Jalalalzanin which server are you using? that tool is available on EU https://usegalaxy.eu/root?tool_id=toolshed.g2.bx.psu.edu/repos/iuc/datamash_ops/datamash_ops/1.1.0
@hexylena I am using Galaxy Europe as well and searched for it, but it is not available.
From the link you mentioned it is available, thank you.
Hi @hexylena, I faced some problems, so I repeated the tutorial from the beginning using NCBI data instead of the zenodo dataset. The output file of "Select lines that match an expression" contains only 16 columns, and the last two seem similar.
When the Cut tool was used with columns 8,20 the result was not correct, so I changed the parameters to 6,15.
Hi @Jalalalzanin I'll check this but could you please use only the zenodo dataset?
I do not want to use NCBI's because then this tutorial needs updates every time they change it :(
@hexylena I used the zenodo dataset before, but I faced some problems with the Replace Text tool: the "LASTZ Alignments" collection did not appear among the collection inputs, so I just dragged it from the history list,
but the output file looks like this. That's why I went back to the beginning of the tutorial and used the NCBI data.
Thanks for the help.
Would it be possible for you to share your history, so I can see what went wrong with the Replace Text step?
Perfect! That's super helpful. I'll have a look now.
Ahh ok, blastn formatted alignments.
That's ok @hexylena, maybe the mistake is on my side for the "E coli c + relatives". I will check the steps again.
Sounds good, thanks for checking :)
You're welcome @hexylena, I appreciate your help because this will help me in my work.
Hello everyone, I faced a problem with the output of the "Search in textfiles (grep)" tool in this tutorial, specifically with “Regular Expression”: ^>. The output is empty. I don't have experience with regular expressions, but when I deleted the regular expression the output of the tool showed three lines, as follows:
CP020543.1
CP024090.1
LT906474.1
However, in the tutorial it should come out like this:
Thank you in advance for your help.
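For readers unfamiliar with the pattern, `^>` matches only lines that begin with the `>` character, which is how FASTA header lines start. The Python sketch below shows this under stated assumptions: the helper `grep_lines` and the example FASTA lines are hypothetical, and it uses Python's `re` module rather than the Galaxy tool itself. One possible reading of the empty output is that the input given to the tool contained only bare identifiers and no `>` header lines.

```python
import re


def grep_lines(lines, pattern):
    """Keep the lines in which the regular expression matches somewhere."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]


# Hypothetical FASTA-like input: header lines start with ">".
fasta = [
    ">CP020543.1 example header",
    "ACGTACGT",
    ">CP024090.1 example header",
    "TTGACCAA",
]

# Input that holds only identifiers, as in the three lines reported above.
ids_only = ["CP020543.1", "CP024090.1", "LT906474.1"]

print(grep_lines(fasta, r"^>"))     # only the header lines are kept
print(grep_lines(ids_only, r"^>"))  # nothing starts with ">", so no lines
print(grep_lines(ids_only, r""))    # an empty pattern matches every line
```

If this reading applies, checking that the tool's input is the FASTA dataset (rather than an already-extracted list of identifiers) would be a reasonable first step.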