Closed lucaskampman closed 2 years ago
What a bummer!
Thanks a lot. Yes, the -1 is plain wrong! I didn't notice it because I am using this script (which obviously does not have the error):
import sys
import pandas as pd
import numpy as np
import dask.dataframe as dd
import click


# options
@click.command()
@click.option('--condition',
              required=True,
              type=str,
              help='Name of the condition.')
@click.option('--counts',
              'counts',
              required=True,
              nargs=2,
              multiple=True,
              type=(str, click.Path(exists=True, readable=True)),
              help='Replicate name and Count file. Can be used multiple times')
@click.option('--output',
              'output_file',
              required=True,
              type=click.Path(writable=True),
              help='Output file.')
def cli(condition, counts, output_file):
    dk_full_df = None
    # Each --counts entry is a (replicate name, count file) pair.
    for replicate_count in counts:
        rep = replicate_count[0]
        file = replicate_count[1]
        # Column names look like: DNA 1 (condition A, replicate 1)
        colnames = ["Barcode", "DNA %s (condition %s, replicate %s)" % (rep, condition, rep),
                    "RNA %s (condition %s, replicate %s)" % (rep, condition, rep)]
        cur = pd.DataFrame(pd.read_csv(file, sep='\t', header=None))
        print(cur.head())
        cur.columns = colnames
        cur_dk = dd.from_pandas(cur, npartitions=1)
        print(cur.head())
        # Outer-merge every replicate on the barcode column.
        if (dk_full_df is not None):
            tmp = dd.merge(dk_full_df, cur_dk, on=["Barcode"], how='outer')
            dk_full_df = tmp
        else:
            dk_full_df = cur_dk
        print(dk_full_df.head())
    # Sort the columns by name before writing the merged table.
    dk_full_df = dk_full_df[sorted(dk_full_df.columns)]
    print(dk_full_df.head())
    dk_full_df.compute().to_csv(output_file, index=False)


if __name__ == '__main__':
    cli()
I will update it and create a new release, v2.3.2.
Just to frame the error: it appears only when count.nf is used with the option --mpranalyze.
Hi,
I think I found a small typo in line 16 of merge_all.py that caused the mpranalyze code to only process replicates 1 through (n-1) in a data set with n replicates.
My .command.sh file reads as follows, with arguments for each of my 3 replicates:
The following code in merge_all.py drops the last argument, only iterating over sys.argv[3] and sys.argv[4], even though sys.argv[5] should be included:
A quick fix was replacing the for loop line above with "for i in range(3,(len(sys.argv)-replicates)):"
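To make the off-by-one concrete, here is a minimal sketch. It assumes the original loop bound was len(sys.argv) - replicates - 1 (inferred from the "-1" mentioned in the reply and from the indices described above), and that the arguments are laid out as script name, condition, output file, then one count file per replicate, then one replicate name per replicate. The function names, file names, and argument values are hypothetical placeholders, not taken from merge_all.py.

```python
def buggy_indices(argv, replicates):
    # Assumed original bound: the extra "- 1" makes the range stop one
    # index early, so the last count file is never visited.
    return list(range(3, len(argv) - replicates - 1))


def fixed_indices(argv, replicates):
    # Bound from the quick fix quoted above.
    return list(range(3, len(argv) - replicates))


# Hypothetical argument vector for 3 replicates (assumed layout).
argv = ["merge_all.py", "condA", "out.csv",
        "rep1_counts.tsv", "rep2_counts.tsv", "rep3_counts.tsv",
        "1", "2", "3"]
replicates = 3

print(buggy_indices(argv, replicates))  # [3, 4]
print(fixed_indices(argv, replicates))  # [3, 4, 5]
```

Under these assumptions the first variant only reaches sys.argv[3] and sys.argv[4], while the second also reaches sys.argv[5], which matches the behaviour I saw with 3 replicates.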
Hope that makes sense — let me know if there's anything I can clarify!
All the best, Lucas